[Qemu-block] [Qemu-devel][PATCH] block: modify top-id's comments
Signed-off-by: Wen CongyangSigned-off-by: Changlong Xie Signed-off-by: Wang WeiWei Signed-off-by: zhanghailiang Signed-off-by: Gonglei Reviewed-by: Eric Blake --- qapi/block-core.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index ada3202..0935b81 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2184,7 +2184,7 @@ # @mode: the replication mode # # @top-id: #optional In secondary mode, node name or device ID of the root -# node who owns the replication node chain. Ignored in primary mode. +# node who owns the replication node chain. Must not be given in primary mode. # # Since: 2.8 ## -- 1.9.3
Re: [Qemu-block] [Qemu-devel] [PATCH 3/3] linux-aio: fix re-entrant completion processing
On Tue, 09/27 19:55, Roman Penyaev wrote: > > The bug is 100% deterministic. Just boot up a guest with -drive > > format=qcow2,aio=native. > > It turns out to be that everything is broken. I started all my > tests with format=raw,aio=native and immediately got coroutine > recursive. That is completely weird. > > So, what I did is the following: > > 1. Took latest master (nothing works) > 2. Did interactive rebase to 12c8720 > 12c8720 2016-06-28 | Merge remote-tracking branch > 'remotes/stefanha/tags/block-pull-request' into staging [Peter > Maydell] > > this merge request includes all your patches related to > virtio-blk and MQ support. > > 3. Applied 0ed93d84edab. Everything works fine. Have you tried qcow2 at this point? raw crashes with 1a62d0accdf85 doesn't mean qcow2 is fine without it. Fam > > 4. Rebased up till 0647d47: > 0647d47 2016-09-13 | qcow2: avoid memcpy(dst, NULL, len) [Stefan Hajnoczi] > > this is the point, after which 0ed93d84edab was applied > on master. > > Got recursive coroutine, so nothing works. > > 5. Did a besect, which shows this commit: > > -- > commit 1a62d0accdf85fbeac149018ee8d1728e754de73 > Author: Eric Blake> Date: Fri Jul 15 12:31:59 2016 -0600 > > block: Fragment reads to max transfer length > -- > > So after this commit my commit 0ed93d84edab stops working. > And now for me is completely not clear what is happening there. > > -- > Roman >
Re: [Qemu-block] [PATCH v2 3/3] linux-aio: fix re-entrant completion processing
On Tue, 09/27 16:18, Stefan Hajnoczi wrote: > Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process > completions from ioq_submit()") added an optimization that processes > completions each time ioq_submit() returns with requests in flight. > This commit introduces a "Co-routine re-entered recursively" error which > can be triggered with -drive format=qcow2,aio=native. > > Fam Zheng, Kevin Wolf , and I > debugged the following backtrace: > > (gdb) bt > #0 0x70a046f5 in raise () at /lib64/libc.so.6 > #1 0x70a062fa in abort () at /lib64/libc.so.6 > #2 0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at > util/qemu-coroutine.c:113 > #3 0x55a4b663 in qemu_laio_process_completions > (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218 > #4 0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at > block/linux-aio.c:331 > #5 0x55a4ba12 in laio_do_submit (fd=fd@entry=13, > laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, > type=type@entry=1) at block/linux-aio.c:383 > #6 0x55a4bbd3 in laio_co_submit (bs=, > s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at > block/linux-aio.c:402 > #7 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, > offset=offset@entry=2932727808, bytes=bytes@entry=8192, > qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804 > #8 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, > req=req@entry=0x59d38d20, offset=offset@entry=2932727808, > bytes=bytes@entry=8192, align=align@entry=512, > qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:1041 > #9 0x55a52db8 in bdrv_co_preadv (child=, > offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, > flags=flags@entry=0) at block/io.c:1133 > #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, > offset=6178725888, bytes=8192, qiov=0x57527840, flags=) at > block/qcow2.c:1509 > #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, > offset=offset@entry=6178725888, bytes=bytes@entry=8192, > qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804 > #12 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x56635890, > req=req@entry=0x59d39000, offset=offset@entry=6178725888, > bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x57527840, > flags=0) at block/io.c:1041 > #13 0x55a52db8 in bdrv_co_preadv (child=, > offset=offset@entry=6178725888, bytes=bytes@entry=8192, > qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133 > #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, > offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at > block/block-backend.c:783 > #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at > block/block-backend.c:991 > #16 0x55ac0cfa in coroutine_trampoline (i0=, > i1=) at util/coroutine-ucontext.c:78 > > It turned out that re-entrant ioq_submit() and completion processing > between three requests caused this error. The following check is not > sufficient to prevent recursively entering coroutines: > > if (laiocb->co != qemu_coroutine_self()) { > qemu_coroutine_enter(laiocb->co); > } > > As the following coroutine backtrace shows, not just the current > coroutine (self) can be entered. There might also be other coroutines > that are currently entered and transferred control due to the qcow2 lock > (CoMutex): > > (gdb) qemu coroutine 0x583464d0 > #0 0x55ac0c90 in qemu_coroutine_switch > (from_=from_@entry=0x583464d0, to_=to_@entry=0x572f9890, > action=action@entry=COROUTINE_ENTER) at util/coroutine-ucontext.c:175 > #1 0x55abfe54 in qemu_coroutine_enter (co=0x572f9890) at > util/qemu-coroutine.c:117 > #2 0x55ac031c in qemu_co_queue_run_restart > (co=co@entry=0x583462c0) at util/qemu-coroutine-lock.c:60 > #3 0x55abfe5e in qemu_coroutine_enter (co=0x583462c0) at > util/qemu-coroutine.c:119 > #4 0x55a4b663 in qemu_laio_process_completions > (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218 > #5 0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at > block/linux-aio.c:331 > #6 0x55a4ba12 in laio_do_submit (fd=fd@entry=13, > laiocb=laiocb@entry=0x5a338b40, offset=offset@entry=2911477760, > type=type@entry=1) at block/linux-aio.c:383 > #7 0x55a4bbd3 in laio_co_submit (bs=, > s=0x57e2f7f0, fd=13, offset=2911477760, qiov=0x5a338e80, type=1) at > block/linux-aio.c:402 > #8 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, > offset=offset@entry=2911477760, bytes=bytes@entry=8192, > qiov=qiov@entry=0x5a338e80, flags=0) at block/io.c:804 > #9 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, >
Re: [Qemu-block] [PATCH v2 2/3] test-coroutine: test qemu_coroutine_entered()
On Tue, 09/27 16:18, Stefan Hajnoczi wrote: > Signed-off-by: Stefan Hajnoczi> --- > tests/test-coroutine.c | 42 ++ > 1 file changed, 42 insertions(+) > > diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c > index 6431dd6..abd97c2 100644 > --- a/tests/test-coroutine.c > +++ b/tests/test-coroutine.c > @@ -53,6 +53,47 @@ static void test_self(void) > } > > /* > + * Check that qemu_coroutine_entered() works > + */ Not related to this patch: It's a bit weird that in this file function header comments are followed by a blank line, and in one case it even looks like as odd as this: static void test_order(void) { int i; const struct coroutine_position expected_pos[] = { {1, 1,}, {2, 1}, {1, 2}, {2, 2}, {1, 3} }; do_order_test(); g_assert_cmpint(record_pos, ==, 5); for (i = 0; i < record_pos; i++) { g_assert_cmpint(records[i].func , ==, expected_pos[i].func ); g_assert_cmpint(records[i].state, ==, expected_pos[i].state); } } /* * Lifecycle benchmark */ static void coroutine_fn empty_coroutine(void *opaque) { /* Do nothing */ } > + > +static void coroutine_fn verify_entered_step_2(void *opaque) > +{ > +Coroutine *caller = (Coroutine *)opaque; > + > +g_assert(qemu_coroutine_entered(caller)); > +g_assert(qemu_coroutine_entered(qemu_coroutine_self())); > +qemu_coroutine_yield(); > + > +/* Once more to check it still works after yielding */ > +g_assert(qemu_coroutine_entered(caller)); > +g_assert(qemu_coroutine_entered(qemu_coroutine_self())); > +qemu_coroutine_yield(); > +} > + > +static void coroutine_fn verify_entered_step_1(void *opaque) > +{ > +Coroutine *self = qemu_coroutine_self(); > +Coroutine *coroutine; > + > +g_assert(qemu_coroutine_entered(self)); > + > +coroutine = qemu_coroutine_create(verify_entered_step_2, self); > +g_assert(!qemu_coroutine_entered(coroutine)); > +qemu_coroutine_enter(coroutine); > +g_assert(!qemu_coroutine_entered(coroutine)); > +qemu_coroutine_enter(coroutine); > +} > + > +static void test_entered(void) > +{ > +Coroutine *coroutine; > + > +coroutine = qemu_coroutine_create(verify_entered_step_1, NULL); > +g_assert(!qemu_coroutine_entered(coroutine)); > +qemu_coroutine_enter(coroutine); > +} > + > +/* > * Check that coroutines may nest multiple levels > */ > > @@ -389,6 +430,7 @@ int main(int argc, char **argv) > g_test_add_func("/basic/yield", test_yield); > g_test_add_func("/basic/nesting", test_nesting); > g_test_add_func("/basic/self", test_self); > +g_test_add_func("/basic/entered", test_entered); > g_test_add_func("/basic/in_coroutine", test_in_coroutine); > g_test_add_func("/basic/order", test_order); > if (g_test_perf()) { > -- > 2.7.4 > Reviewed-by: Fam Zheng
Re: [Qemu-block] [PATCH v2 1/3] coroutine: add qemu_coroutine_entered() function
On Tue, 09/27 16:18, Stefan Hajnoczi wrote: > See the doc comments for a description of this new coroutine API. > > Signed-off-by: Stefan Hajnoczi> --- > include/qemu/coroutine.h | 13 + > util/qemu-coroutine.c| 5 + > 2 files changed, 18 insertions(+) > > diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h > index 29a2078..e6a60d5 100644 > --- a/include/qemu/coroutine.h > +++ b/include/qemu/coroutine.h > @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void); > */ > bool qemu_in_coroutine(void); > > +/** > + * Return true if the coroutine is currently entered > + * > + * A coroutine is "entered" if it has not yielded from the current > + * qemu_coroutine_enter() call used to run it. This does not mean that the > + * coroutine is currently executing code since it may have transferred > control > + * to another coroutine using qemu_coroutine_enter(). > + * > + * When several coroutines enter each other there may be no way to know which > + * ones have already been entered. In such situations this function can be > + * used to avoid recursively entering coroutines. > + */ > +bool qemu_coroutine_entered(Coroutine *co); > > > /** > diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c > index 3cbf225..737bffa 100644 > --- a/util/qemu-coroutine.c > +++ b/util/qemu-coroutine.c > @@ -146,3 +146,8 @@ void coroutine_fn qemu_coroutine_yield(void) > self->caller = NULL; > qemu_coroutine_switch(self, to, COROUTINE_YIELD); > } > + > +bool qemu_coroutine_entered(Coroutine *co) > +{ > +return co->caller; > +} > -- > 2.7.4 > Reviewed-by: Fam Zheng
Re: [Qemu-block] [PATCH v24 11/12] support replication driver in blockdev-add
在 2016年09月12日 22:01, Stefan Hajnoczi 写道: On Mon, Aug 15, 2016 at 05:32:19PM +0800, Changlong Xie wrote: On 08/15/2016 04:37 PM, Kevin Wolf wrote: Am 15.08.2016 um 03:49 hat Changlong Xie geschrieben: On 08/09/2016 05:08 PM, Kevin Wolf wrote: Am 27.07.2016 um 09:01 hat Changlong Xie geschrieben: From: Wen CongyangSigned-off-by: Wen Congyang Signed-off-by: Changlong Xie Signed-off-by: Wang WeiWei Signed-off-by: zhanghailiang Signed-off-by: Gonglei Reviewed-by: Eric Blake @@ -2078,6 +2079,23 @@ { 'enum' : 'ReplicationMode', 'data' : [ 'primary', 'secondary' ] } ## +# @BlockdevOptionsReplication +# +# Driver specific block device options for replication +# +# @mode: the replication mode +# +# @top-id: #optional In secondary mode, node name or device ID of the root +# node who owns the replication node chain. Ignored in primary mode. Can we change this to "Must not be given in primary mode"? Not sure what the code currently does, but I think it should error out if top-id is Replication driver will ignore "top-id" parameter in Primary mode. This is not good behaviour, which is why I requested a change. Hi stefan Would you like me send another [PATCH v25] based your block-next? Or a separate patch until your tree is merged. Sorry for the slow response. Please send a new patch on top of my block-next tree. Stefan ok, I will send a new patch on the top of block-next tree.
Re: [Qemu-block] [PATCH v24 00/12] Block replication for continuous checkpoints
在 2016年09月12日 22:11, Stefan Hajnoczi 写道: On Mon, Aug 08, 2016 at 03:50:27PM +0100, Stefan Hajnoczi wrote: On Wed, Jul 27, 2016 at 03:01:41PM +0800, Changlong Xie wrote: Block replication is a very important feature which is used for continuous checkpoints(for example: COLO). You can get the detailed information about block replication from here: http://wiki.qemu.org/Features/BlockReplication Usage: Please refer to docs/block-replication.txt You can get the patch here: https://github.com//Pating/qemu/tree/block-replication-v24 You can get the patch with framework here: https://github.com//Pating/qemu/tree/colo_framework_v23 TODO: 1. Continuous block replication. It will be started after basic functions are accepted. Change Log: V24: 1. Address comments from Max p9: pass NULL to bdrv_lookup_bs(), and introduce bdrv_is_root_node() to check top_bs p11: perfect @top-id description, and make it #optional p12: "replication" => "Replication", add docs/block-replication.txt Note: we need bdrv_is_root_node() in p9, so this patchset is based on kevin/qmp-node-name, V23: 1. Address comments from Stefan and Max, this series introduce p7/p12 p2. add Copyright for block_backup.h p7. support configure --disable-replication p8. update 2.7 to 2.8 p11. update 2.7 to 2.8, add missing "top-id" p12. update MAINTAINERS V22: 1. Rebase to the lastest code 2. modify code adapt to the modification of backup_start & commit_active_start 3. rewrite io_read & io_write for interface changes V21: 1. Rebase to the lastest code 2. use bdrv_pwrite_zeroes() and BDRV_SECTOR_BITS for p9 V20 Resend: 1. Resend to avoid bothering qemu-trivial maintainers 2. Address comments from Eric, fix header file issue and add a brief commit message for p7 V20: 1. Rebase to the lastest code 2. Address comments from stefan p8: 1. error_setg() with an error message when check_top_bs() fails. 2. remove bdrv_ref(s->hidden_disk->bs) since commit 5c438bc6 3. use bloc_job_cancel_sync() before active commit p9: 1. fix uninitialized 'pattern_buf' 2. introduce mkstemp(3) to fix unique filenames 3. use qemu_vfree() for qemu_blockalign() memory 4. add missing replication_start_all() 5. remove useless pattern for io_write() V19: 1. Rebase to v2.6.0 2. Address comments from stefan p3: a new patch that export interfaces for extra serialization p8: 1. call replication_stop() before freeing s->top_id 2. check top_bs 3. reopen file readonly in error return paths 4. enable extra serialization between read and COW p9: try to hanlde SIGABRT V18: p6: add local_err in all replication callbacks to prevent "errp == NULL" p7: add missing qemu_iovec_destroy(xxx) V17: 1. Rebase to the lastest codes p2: refactor backup_do_checkpoint addressed comments from Jeff Cody p4: fix bugs in "drive_add buddy xxx" hmp commands p6: add "since: 2.7" p7: fix bug in replication_close(), add missing "qapi/error.h", add test-replication p8: add "since: 2.7" V16: 1. Rebase to the newest codes 2. Address comments from Stefan & hailiang p3: we don't need this patch now p4: add "top-id" parameters for secondary p6: fix NULL pointer in replication callbacks, remove unnecessary typedefs, add doc comments that explain the semantics of Replication p7: Refactor AioContext for thread-safe, remove unnecessary get_top_bs() *Note*: I'm working on replication testcase now, will send out in V17 V15: 1. Rebase to the newest codes 2. Fix typos and coding style addresed Eric's comments 3. Address Stefan's comments 1) Make backup_do_checkpoint public, drop the changes on BlockJobDriver 2) Update the message and description for [PATCH 4/9] 3) Make replication_(start/stop/do_checkpoint)_all as global interfaces 4) Introduce AioContext lock to protect start/stop/do_checkpoint callbacks 5) Use BdrvChild instead of holding on to BlockDriverState * pointers 4. Clear BDRV_O_INACTIVE for hidden disk's open_flags since commit 09e0c771 5. Introduce replication_get_error_all to check replication status 6. Remove useless discard interface V14: 1. Implement auto complete active commit 2. Implement active commit block job for replication.c 3. Address the comments from Stefan, add replication-specific API and data structure, also remove old block layer APIs V13: 1. Rebase to the newest codes 2. Remove redundant marcos and semicolon in replication.c 3. Fix typos in block-replication.txt V12: 1. Rebase to the newest codes 2. Use backing reference to replcace 'allow-write-backing-file' V11: 1. Reopen the backing file when starting blcok replication if it is not opened in R/W mode 2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET when opening backing file 3. Block the top BDS so there is only one block job for the top BDS and its backing chain. V10: 1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing reference. 2. Address the comments from Eric Blake V9: 1. Update the error messages 2. Rebase to the newest qemu 3. Split child add/delete
Re: [Qemu-block] [Qemu-devel] [PATCH v14 04/19] qapi: add trace events for visitor
On 09/27/2016 08:13 AM, Daniel P. Berrange wrote: > Allow tracing of the operation of visitors Ooooh, shiny! > > Signed-off-by: Daniel P. Berrange> --- > Makefile.objs | 1 + > qapi/qapi-visit-core.c | 27 +++ > qapi/trace-events | 33 + > 3 files changed, 61 insertions(+) > create mode 100644 qapi/trace-events > Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [Qemu-devel] [PATCH v14 07/19] qapi: don't pass two copies of TestInputVisitorData to tests
On 09/27/2016 05:10 PM, Eric Blake wrote: > On 09/27/2016 08:13 AM, Daniel P. Berrange wrote: >> The input_visitor_test_add() method was accepting an instance >> of 'TestInputVisitorData' and passing it as the 'user_data' >> parameter to test functions. The main 'TestInputVisitorData' >> instance that was actually used, was meanwhile being allocated >> automatically by the test framework fixture setup. >> >> Signed-off-by: Daniel P. Berrange>> --- >> tests/test-qobject-input-visitor.c | 76 >> -- >> 1 file changed, 32 insertions(+), 44 deletions(-) >> > > Reviewed-by: Eric Blake > Having said that, I note that ALL callers now pass NULL for user_data. If you plan on using it later in the series for something other than NULL for some of the (new?) tests added at that point, it would be wise to say so in the commit message; if not, I would suggest eliminating the parameter altogether. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [Qemu-devel] [PATCH v14 07/19] qapi: don't pass two copies of TestInputVisitorData to tests
On 09/27/2016 08:13 AM, Daniel P. Berrange wrote: > The input_visitor_test_add() method was accepting an instance > of 'TestInputVisitorData' and passing it as the 'user_data' > parameter to test functions. The main 'TestInputVisitorData' > instance that was actually used, was meanwhile being allocated > automatically by the test framework fixture setup. > > Signed-off-by: Daniel P. Berrange> --- > tests/test-qobject-input-visitor.c | 76 > -- > 1 file changed, 32 insertions(+), 44 deletions(-) > Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [Qemu-devel] [PATCH v14 03/19] option: allow qemu_opts_to_qdict to merge repeated options
On 09/27/2016 08:13 AM, Daniel P. Berrange wrote: > If given an option string such as > > size=1024,nodes=10,nodes=4-5,nodes=1-2,policy=bind > > the qemu_opts_to_qdict() method will currently overwrite > the values for repeated option keys, so only the last > value is in the returned dict: > > size=1024 > nodes=1-2 > policy=bind > > This adds the ability for the caller to ask that the > repeated keys be turned into list indexes: > > size=1024 > nodes.0=10 > nodes.1=4-5 > nodes.2=1-2 > policy=bind > > Note that the conversion has no way of knowing whether > any given key is expected to be a list upfront - it can > only figure that out when seeing the first duplicated > key. Thus the caller has to be prepared to deal with the > fact that if a key 'foo' is a list, then the returned > qdict may contain either 'foo' (if only a single instance > of the key was seen) or 'foo.NN' (if multiple instances > of the key were seen). > > Signed-off-by: Daniel P. Berrange> --- If I'm not mistaken, this policy adds a new policy, but all existing clients use the old policy, and the new policy is exercised only by the testsuite additions. Might be useful to mention that in the commit message, rather than making me read the whole commit before guessing that. > +++ b/blockdev.c > @@ -911,7 +911,8 @@ DriveInfo *drive_new(QemuOpts *all_opts, > BlockInterfaceType block_default_type) > > /* Get a QDict for processing the options */ > bs_opts = qdict_new(); > -qemu_opts_to_qdict(all_opts, bs_opts); > +qemu_opts_to_qdict(all_opts, bs_opts, > + QEMU_OPTS_REPEAT_POLICY_LAST); git send-email/format-patch -O/path/to/file (or the corresponding config option) allows you to sort the diff to put the interesting stuff first (in this case, the new enum). > +++ b/include/qemu/option.h > @@ -125,7 +125,13 @@ void qemu_opts_set_defaults(QemuOptsList *list, const > char *params, > int permit_abbrev); > QemuOpts *qemu_opts_from_qdict(QemuOptsList *list, const QDict *qdict, > Error **errp); > -QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict); > +typedef enum { > +QEMU_OPTS_REPEAT_POLICY_LAST, > +QEMU_OPTS_REPEAT_POLICY_LIST, Hmm. I suspect this subtle difference (one vowel) to be the source of typo bugs. Can we come up with more obvious policy names, such as LAST_ONLY vs. INTO_LIST? Except that doing that makes it harder to fit 80 columns. So up to you if you want to ignore me here. On the other hand, a documentation comment here would go a long ways to helping future readers: LAST: last occurrence of a duplicate option silently overwrites all earlier LIST: each occurrence of a duplicate option converts it into a list maybe you also want to add: ERROR: an occurrence of a duplicate option is considered an error Also, while you turn 'foo=a,foo=b' into 'foo.0=a,foo.1=b', does your code correctly handle the cases of 'foo.0=a,foo=b' and 'foo=a,foo.1=b'? (And what IS the correct handling of those cases logically supposed to be?) > +++ b/tests/test-qemu-opts.c > @@ -421,6 +421,45 @@ static void test_qemu_opts_set(void) > g_assert(opts == NULL); > } > > + > +static void test_qemu_opts_to_qdict(void) > +{ Here would be a good place to test the two mixed-use optstrings I mentioned above (inconsistent use of plain vs. list syntax). > +} > + > int main(int argc, char *argv[]) > { > +++ b/util/qemu-option.c > @@ -1058,10 +1058,12 @@ void qemu_opts_absorb_qdict(QemuOpts *opts, QDict > *qdict, Error **errp) > * TODO We'll want to use types appropriate for opt->desc->type, but > * this is enough for now. > */ > -QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict) > +QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict, > + QemuOptsRepeatPolicy repeatPolicy) > { > QemuOpt *opt; > -QObject *val; > +QObject *val, *prevval; > +QDict *lists = qdict_new(); > > if (!qdict) { > qdict = qdict_new(); > @@ -1070,9 +1072,42 @@ QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict) > qdict_put(qdict, "id", qstring_from_str(opts->id)); > } > QTAILQ_FOREACH(opt, >head, next) { > +gchar *key = NULL; > val = QOBJECT(qstring_from_str(opt->str)); > -qdict_put_obj(qdict, opt->name, val); > +switch (repeatPolicy) { > +case QEMU_OPTS_REPEAT_POLICY_LIST: > +if (qdict_haskey(lists, opt->name)) { > +/* Current val goes into 'foo.N' */ > +int64_t max = qdict_get_int(lists, opt->name); > +max++; > +key = g_strdup_printf("%s.%" PRId64, opt->name, max); > +qdict_put_obj(lists, opt->name, QOBJECT(qint_from_int(max))); > +qdict_put_obj(qdict, key, val); > +} else if (qdict_haskey(qdict, opt->name)) { > +/* Move previous
Re: [Qemu-block] [Qemu-devel] [PATCH v14 01/19] qdict: implement a qdict_crumple method for un-flattening a dict
On 09/27/2016 08:13 AM, Daniel P. Berrange wrote: > The qdict_flatten() method will take a dict whose elements are > further nested dicts/lists and flatten them by concatenating > keys. > > The qdict_crumple() method aims to do the reverse, taking a flat > qdict, and turning it into a set of nested dicts/lists. It will > apply nesting based on the key name, with a '.' indicating a > new level in the hierarchy. If the keys in the nested structure > are all numeric, it will create a list, otherwise it will create > a dict. > > If the keys are a mixture of numeric and non-numeric, or the > numeric keys are not in strictly ascending order, an error will > be reported. > > > The intent of this function is that it allows a set of QemuOpts > to be turned into a nested data structure that mirrors the nesting > used when the same object is defined over QMP. > > Reviewed-by: Kevin Wolf> Reviewed-by: Marc-André Lureau > Signed-off-by: Daniel P. Berrange > --- > + > +/** > + * qdict_split_flat_key: > + * @key: the key string to split > + * @prefix: non-NULL pointer to hold extracted prefix > + * @suffix: non-NULL pointer to remaining suffix > + * > + * Given a flattened key such as 'foo.0.bar', split it into two parts > + * at the first '.' separator. Allows double dot ('..') to escape the > + * normal separator. > + * > + * eg s/eg/e.g./ or just spell it as 'for example' > +static int qdict_is_list(QDict *maybe_list, Error **errp) > + > +/* NB this isn't a perfect check - eg it won't catch Another such use. > + * a list containing '1', '+1', '01', '3', but that > + * does not matter - we've still proved that the > + * input is a list. It is up the caller to do a > + * stricter check if desired */ > +if (len != (max + 1)) { > +error_setg(errp, "List indexes are not contiguous, " s/indexes/indices/ ? (my spellchecker likes both, but indexes is a sign that modern English speakers are getting lazy and drifting away from Latin) > +/** > + * qdict_crumple: > + * @src: the original flat dictionary (only scalar values) to crumple > + * @recursive: true to recursively crumple nested dictionaries > + * For example, an input of: > + * > + * { 'foo.0.bar': 'one', 'foo.0.wizz': '1', > + * 'foo.1.bar': 'two', 'foo.1.wizz': '2' } > + * > + * will result in any output of: s/any/an/ > + * > + * { > + * 'foo': [ > + * { 'bar': 'one', 'wizz': '1' }, > + * { 'bar': 'two', 'wizz': '2' } > + * ], > + * } > + * > + * The following scenarios in the input dict will result in an > + * error being returned: > + * > + * - Any values in @src are non-scalar types > + * - If keys in @src imply that a particular level is both a > + *list and a dict. eg, "foo.0.bar" and "foo.eek.bar". > + * - If keys in @src imply that a particular level is a list, > + *but the indexes are non-contigous. eg "foo.0.bar" and s/contigous/contiguous/ and another pesky 'eg' Modulo typo fixes and potential grammar changes, Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [PATCH v2 3/7] block/qapi: Move 'aio' option to file driver
On 27.09.2016 10:20, Kevin Wolf wrote: > Am 26.09.2016 um 18:49 hat Max Reitz geschrieben: >> On 23.09.2016 16:32, Kevin Wolf wrote: >>> The option whether or not to use a native AIO interface really isn't a >>> generic option for all drivers, but only applies to the native file >>> protocols. This patch moves the option in blockdev-add to the >>> appropriate places (raw-posix and raw-win32). >>> >>> We still have to keep the flag BDRV_O_NATIVE_AIO for compatibility >>> because so far the AIO option was usually specified on the wrong layer >>> (the top-level format driver, which didn't even look at it) and then >>> inherited by the protocol driver (where it was actually used). We can't >>> forbid this use except in new interfaces. >>> >>> Signed-off-by: Kevin Wolf>>> --- >>> block/raw-posix.c | 44 --- >>> block/raw-win32.c | 56 >>> +- >>> qapi/block-core.json | 6 +++--- >>> tests/qemu-iotests/087 | 4 ++-- >>> 4 files changed, 83 insertions(+), 27 deletions(-) >> >> [...] >> >>> diff --git a/block/raw-win32.c b/block/raw-win32.c >>> index 56f45fe..734bb10 100644 >>> --- a/block/raw-win32.c >>> +++ b/block/raw-win32.c >> >> [...] >> >>> +static bool get_aio_option(QemuOpts *opts, int flags, Error **errp) >>> +{ >>> +BlockdevAioOptions aio, aio_default; >>> + >>> +aio_default = (flags & BDRV_O_NATIVE_AIO) ? BLOCKDEV_AIO_OPTIONS_NATIVE >>> + : >>> BLOCKDEV_AIO_OPTIONS_THREADS; >>> +aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, >>> "aio"), >>> + BLOCKDEV_AIO_OPTIONS__MAX, aio_default, errp); >>> + >>> +switch (aio) { >>> +case BLOCKDEV_AIO_OPTIONS_NATIVE: >>> +return true; >>> +case BLOCKDEV_AIO_OPTIONS_THREADS: >>> +return false; >>> +default: >>> +error_setg(errp, "Invalid AIO option"); >> >> Any reason for catching this case here but not in raw-posix? >> >> (Not that it really matters, though.) > > Nobody will forget raw-posix when adding a new AIO mode to win32, so I > didn't feel like the additional code was worth it there. But if we add a > new AIO mode to raw-posix, I'm pretty sure we will forget win32. :-) Good point. Max signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [Qemu-devel] [PULL 00/18] Block layer patches
On 27 September 2016 at 06:53, Kevin Wolfwrote: > The following changes since commit 7cfdc02dae0d2ff58c897496cfdbbafc0eda0f3f: > > Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging > (2016-09-26 19:47:00 +0100) > > are available in the git repository at: > > > git://repo.or.cz/qemu/kevin.git tags/for-upstream > > for you to fetch changes up to 3b856cebe5e93547852c156ca2119d075e62aed7: > > coroutine: reduce stack size to 60kB (2016-09-27 14:05:21 +0200) > > > Block layer patches > > I see 'make check' failures on x86-64 host, clang Linux: /i386/ahci/migrate/ncq/simple: OK /i386/ahci/migrate/ncq/halted: OK /i386/ahci/cdrom/dma/single: OK /i386/ahci/cdrom/dma/multi: OK /i386/ahci/cdrom/pio/single: Broken pipe FAIL GTester: last random seed: R02Sa8f729848b07c3b3e5ee67368f9d0350 (pid=10590) /i386/ahci/cdrom/pio/multi: Broken pipe FAIL GTester: last random seed: R02Se85704e04bbd382223983c878723b811 (pid=10598) FAIL: tests/ahci-test TEST: tests/hd-geo-test... (pid=10601) /i386/hd-geo/ide/none: OK thanks -- PMM
Re: [Qemu-block] [PATCH v3 1/3] qemu-nbd: Add --fork option
On 27.09.2016 11:04, Kevin Wolf wrote: > Am 25.08.2016 um 18:30 hat Max Reitz geschrieben: >> Using the --fork option, one can make qemu-nbd fork the worker process. >> The original process will exit on error of the worker or once the worker >> enters the main loop. >> >> Suggested-by: Sascha Silbe>> Signed-off-by: Max Reitz >> --- >> qemu-nbd.c | 15 ++- >> 1 file changed, 14 insertions(+), 1 deletion(-) >> >> diff --git a/qemu-nbd.c b/qemu-nbd.c >> index e3571c2..8c2d582 100644 >> --- a/qemu-nbd.c >> +++ b/qemu-nbd.c >> @@ -48,6 +48,7 @@ >> #define QEMU_NBD_OPT_OBJECT260 >> #define QEMU_NBD_OPT_TLSCREDS 261 >> #define QEMU_NBD_OPT_IMAGE_OPTS262 >> +#define QEMU_NBD_OPT_FORK 263 >> >> #define MBR_SIZE 512 >> >> @@ -503,6 +504,7 @@ int main(int argc, char **argv) >> { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS }, >> { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS }, >> { "trace", required_argument, NULL, 'T' }, >> +{ "fork", no_argument, NULL, QEMU_NBD_OPT_FORK }, >> { NULL, 0, NULL, 0 } >> }; >> int ch; >> @@ -524,6 +526,8 @@ int main(int argc, char **argv) >> bool imageOpts = false; >> bool writethrough = true; >> char *trace_file = NULL; >> +bool fork_process = false; >> +int old_stderr = -1; >> >> /* The client thread uses SIGTERM to interrupt the server. A signal >> * handler ensures that "qemu-nbd -v -c" exits with a nice status code. >> @@ -714,6 +718,9 @@ int main(int argc, char **argv) >> g_free(trace_file); >> trace_file = trace_opt_parse(optarg); >> break; >> +case QEMU_NBD_OPT_FORK: >> +fork_process = true; >> +break; >> } >> } >> >> @@ -773,7 +780,7 @@ int main(int argc, char **argv) >> return 0; >> } >> >> -if (device && !verbose) { >> +if ((device && !verbose) || fork_process) { >> int stderr_fd[2]; >> pid_t pid; >> int ret; >> @@ -796,6 +803,7 @@ int main(int argc, char **argv) >> ret = qemu_daemon(1, 0); >> >> /* Temporarily redirect stderr to the parent's pipe... */ >> +old_stderr = dup(STDERR_FILENO); >> dup2(stderr_fd[1], STDERR_FILENO); >> if (ret < 0) { >> error_report("Failed to daemonize: %s", strerror(errno)); > > I don't have a kernel with NBD support, so I can't test this easily, but > doesn't this break the daemon mode for --connect? I mean, it will still > fork, but won't the parent be alive until the child exits? Well, for me the parent closes as it should. > >> @@ -951,6 +959,11 @@ int main(int argc, char **argv) >> exit(EXIT_FAILURE); >> } >> >> +if (fork_process) { >> +dup2(old_stderr, STDERR_FILENO); >> +close(old_stderr); >> +} > > Because this code doesn't run for --connect (unless --fork is given, > too). Hm, so? It never ran before either, because I'm only just now introducing it. And the fact that I'm keeping the original stderr FD open has nothing to do with when the parent process will quit because the parent process is not connected to that *original* stderr. Also, when using --connect, the FD is closed in nbd_client_thread(). > >> state = RUNNING; >> do { >> main_loop_wait(false); > > The documentation needs an update, too. Right. I wonder why I forgot this. I guess the answer is "Because I wrote this in some spare time at KVM Forum to see if it would work at all"... Max signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
On 09/27/2016 08:04 PM, Paolo Bonzini wrote: > > On 27/09/2016 15:28, Denis V. Lunev wrote: >> On 09/27/2016 03:07 PM, Paolo Bonzini wrote: >>> - Original Message - From: "Denis V. Lunev"To: "Paolo Bonzini" Cc: "Vladimir Sementsov-Ogievskiy" , qemu-de...@nongnu.org, qemu-block@nongnu.org, nbd-gene...@lists.sourceforge.net, a...@alex.org.uk, ebl...@redhat.com, kw...@redhat.com, stefa...@redhat.com, w...@uter.be Sent: Tuesday, September 27, 2016 12:25:54 PM Subject: Re: [PATCH] proto: add 'shift' extension. On 09/27/2016 01:15 PM, Paolo Bonzini wrote: >> We could go in a different direction and export flag >> 'has_zero_init' which will report that the storage is >> initialized with all zeroes at the moment. In this >> case mirroring code will not fall into this >> branch. > Why don't you add the zero_init flag to QEMU's NBD driver instead? for all cases without knowing real backend on the server side? I think this would be wrong. >>> Add it to the command line, and leave it to libvirt or the user to >>> pass "-drive file.driver=nbd,...,file.zero-init=on". >> I have started with something very similar for 'drive-mirror' command. >> We have added additional flag for this to improve migration speed >> and this was rejected. > You can add it through the filename path too, through a URI option > "nbd://...?zero-init=on". > > Paolo ha, cool idea! Thanks! Den
Re: [Qemu-block] [PATCH 3/3] linux-aio: fix re-entrant completion processing
On Tue, Sep 27, 2016 at 5:25 PM, Stefan Hajnocziwrote: > On Tue, Sep 27, 2016 at 04:29:55PM +0200, Roman Penyaev wrote: >> On Tue, Sep 27, 2016 at 4:06 PM, Stefan Hajnoczi wrote: >> > Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process >> > completions from ioq_submit()") added an optimization that processes >> > completions each time ioq_submit() returns with requests in flight. >> > This commit introduces a "Co-routine re-entered recursively" error which >> > can be triggered with -drive format=qcow2,aio=native. >> > >> > Fam Zheng , Kevin Wolf , and I >> > debugged the following backtrace: >> > >> > (gdb) bt >> > #0 0x70a046f5 in raise () at /lib64/libc.so.6 >> > #1 0x70a062fa in abort () at /lib64/libc.so.6 >> > #2 0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at >> > util/qemu-coroutine.c:113 >> > #3 0x55a4b663 in qemu_laio_process_completions >> > (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218 >> > #4 0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at >> > block/linux-aio.c:331 >> > #5 0x55a4ba12 in laio_do_submit (fd=fd@entry=13, >> > laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, >> > type=type@entry=1) at block/linux-aio.c:383 >> > #6 0x55a4bbd3 in laio_co_submit (bs=, >> > s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) >> > at block/linux-aio.c:402 >> > #7 0x55a4fd23 in bdrv_driver_preadv >> > (bs=bs@entry=0x5663bcb0, offset=offset@entry=2932727808, >> > bytes=bytes@entry=8192, qiov=qiov@entry=0x59d38e20, flags=0) at >> > block/io.c:804 >> > #8 0x55a52b34 in bdrv_aligned_preadv >> > (bs=bs@entry=0x5663bcb0, req=req@entry=0x59d38d20, >> > offset=offset@entry=2932727808, bytes=bytes@entry=8192, >> > align=align@entry=512, qiov=qiov@entry=0x59d38e20, flags=0) at >> > block/io.c:1041 >> > #9 0x55a52db8 in bdrv_co_preadv (child=, >> > offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, >> > flags=flags@entry=0) at block/io.c:1133 >> > #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, >> > offset=6178725888, bytes=8192, qiov=0x57527840, flags=) >> > at block/qcow2.c:1509 >> > #11 0x55a4fd23 in bdrv_driver_preadv >> > (bs=bs@entry=0x56635890, offset=offset@entry=6178725888, >> > bytes=bytes@entry=8192, qiov=qiov@entry=0x57527840, flags=0) at >> > block/io.c:804 >> > #12 0x55a52b34 in bdrv_aligned_preadv >> > (bs=bs@entry=0x56635890, req=req@entry=0x59d39000, >> > offset=offset@entry=6178725888, bytes=bytes@entry=8192, >> > align=align@entry=1, qiov=qiov@entry=0x57527840, flags=0) at >> > block/io.c:1041 >> > #13 0x55a52db8 in bdrv_co_preadv (child=, >> > offset=offset@entry=6178725888, bytes=bytes@entry=8192, >> > qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133 >> > #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, >> > offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at >> > block/block-backend.c:783 >> > #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at >> > block/block-backend.c:991 >> > #16 0x55ac0cfa in coroutine_trampoline (i0=, >> > i1=) at util/coroutine-ucontext.c:78 >> > >> > It turned out that re-entrant ioq_submit() and completion processing >> > between three requests caused this error. The following check is not >> > sufficient to prevent recursively entering coroutines: >> > >> > if (laiocb->co != qemu_coroutine_self()) { >> > qemu_coroutine_enter(laiocb->co); >> > } >> > >> > As the following coroutine backtrace shows, not just the current >> > coroutine (self) can be entered. There might also be other coroutines >> > that are currently entered and transferred control due to the qcow2 lock >> > (CoMutex): >> >> I doubt that that was introduced by the commit you've specified: >> 0ed93d84edab. >> >> Before my patch coroutine was unconditionally entered. The following >> is what was changed by 0ed93d84edab: >> >> if (laiocb->co) { >> -qemu_coroutine_enter(laiocb->co); >> +/* Jump and continue completion for foreign requests, don't do >> + * anything for current request, it will be completed shortly. */ >> +if (laiocb->co != qemu_coroutine_self()) { >> +qemu_coroutine_enter(laiocb->co); >> +} > > Unconditionally entering was safe prior to 0ed93d84edab since all > coroutines yielded and qemu_coroutine_entered() would be false all the > time. Therefore it wasn't necessary to protect against re-entering a > coroutine. > >> If you have a strong reproduction, could you please verify that. > > The bug is 100% deterministic. Just boot up a guest with -drive > format=qcow2,aio=native. It turns out to be that everything is broken. I started all my
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
On 09/27/2016 03:07 PM, Paolo Bonzini wrote: > > - Original Message - >> From: "Denis V. Lunev">> To: "Paolo Bonzini" >> Cc: "Vladimir Sementsov-Ogievskiy" , >> qemu-de...@nongnu.org, qemu-block@nongnu.org, >> nbd-gene...@lists.sourceforge.net, a...@alex.org.uk, ebl...@redhat.com, >> kw...@redhat.com, stefa...@redhat.com, >> w...@uter.be >> Sent: Tuesday, September 27, 2016 12:25:54 PM >> Subject: Re: [PATCH] proto: add 'shift' extension. >> >> On 09/27/2016 01:15 PM, Paolo Bonzini wrote: We could go in a different direction and export flag 'has_zero_init' which will report that the storage is initialized with all zeroes at the moment. In this case mirroring code will not fall into this branch. >>> Why don't you add the zero_init flag to QEMU's NBD driver instead? >> for all cases without knowing real backend on the server side? >> I think this would be wrong. > Add it to the command line, and leave it to libvirt or the user to > pass "-drive file.driver=nbd,...,file.zero-init=on". > > Paolo this specifically means that all QMP commands like 'drive-backup' and 'drive-mirror' will have to be modified to pass this attribute. If this is OK, we can proceed with that. Den
Re: [Qemu-block] [Qemu-devel] [PATCH 1/3] coroutine: add qemu_coroutine_entered() function
On 27/09/2016 18:29, Stefan Hajnoczi wrote: > On Tue, Sep 27, 2016 at 5:20 PM, Paolo Bonziniwrote: >> >> >> On 27/09/2016 16:06, Stefan Hajnoczi wrote: >>> See the doc comments for a description of this new coroutine API. >>> >>> Signed-off-by: Stefan Hajnoczi >>> --- >>> include/qemu/coroutine.h | 13 + >>> util/qemu-coroutine.c| 5 + >>> 2 files changed, 18 insertions(+) >>> >>> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h >>> index 29a2078..e6a60d5 100644 >>> --- a/include/qemu/coroutine.h >>> +++ b/include/qemu/coroutine.h >>> @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void); >>> */ >>> bool qemu_in_coroutine(void); >>> >>> +/** >>> + * Return true if the coroutine is currently entered >>> + * >>> + * A coroutine is "entered" if it has not yielded from the current >>> + * qemu_coroutine_enter() call used to run it. This does not mean that the >>> + * coroutine is currently executing code since it may have transferred >>> control >>> + * to another coroutine using qemu_coroutine_enter(). >>> + * >>> + * When several coroutines enter each other there may be no way to know >>> which >>> + * ones have already been entered. In such situations this function can be >>> + * used to avoid recursively entering coroutines. >>> + */ >>> +bool qemu_coroutine_entered(Coroutine *co); >> >> Perhaps qemu_coroutine_running is a better name? > > I find "running" confusing since the coroutine may not actually be > currently executing (as mentioned in the doc comment). Ok, makes sense. Another possibility is qemu_coroutine_on_stack, but I'm not sure it's better... Paolo
Re: [Qemu-block] [Qemu-devel] [PATCH 1/3] coroutine: add qemu_coroutine_entered() function
On Tue, Sep 27, 2016 at 5:20 PM, Paolo Bonziniwrote: > > > On 27/09/2016 16:06, Stefan Hajnoczi wrote: >> See the doc comments for a description of this new coroutine API. >> >> Signed-off-by: Stefan Hajnoczi >> --- >> include/qemu/coroutine.h | 13 + >> util/qemu-coroutine.c| 5 + >> 2 files changed, 18 insertions(+) >> >> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h >> index 29a2078..e6a60d5 100644 >> --- a/include/qemu/coroutine.h >> +++ b/include/qemu/coroutine.h >> @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void); >> */ >> bool qemu_in_coroutine(void); >> >> +/** >> + * Return true if the coroutine is currently entered >> + * >> + * A coroutine is "entered" if it has not yielded from the current >> + * qemu_coroutine_enter() call used to run it. This does not mean that the >> + * coroutine is currently executing code since it may have transferred >> control >> + * to another coroutine using qemu_coroutine_enter(). >> + * >> + * When several coroutines enter each other there may be no way to know >> which >> + * ones have already been entered. In such situations this function can be >> + * used to avoid recursively entering coroutines. >> + */ >> +bool qemu_coroutine_entered(Coroutine *co); > > Perhaps qemu_coroutine_running is a better name? I find "running" confusing since the coroutine may not actually be currently executing (as mentioned in the doc comment). Stefan
Re: [Qemu-block] [PATCH 1/3] coroutine: add qemu_coroutine_entered() function
On 27/09/2016 16:06, Stefan Hajnoczi wrote: > See the doc comments for a description of this new coroutine API. > > Signed-off-by: Stefan Hajnoczi> --- > include/qemu/coroutine.h | 13 + > util/qemu-coroutine.c| 5 + > 2 files changed, 18 insertions(+) > > diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h > index 29a2078..e6a60d5 100644 > --- a/include/qemu/coroutine.h > +++ b/include/qemu/coroutine.h > @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void); > */ > bool qemu_in_coroutine(void); > > +/** > + * Return true if the coroutine is currently entered > + * > + * A coroutine is "entered" if it has not yielded from the current > + * qemu_coroutine_enter() call used to run it. This does not mean that the > + * coroutine is currently executing code since it may have transferred > control > + * to another coroutine using qemu_coroutine_enter(). > + * > + * When several coroutines enter each other there may be no way to know which > + * ones have already been entered. In such situations this function can be > + * used to avoid recursively entering coroutines. > + */ > +bool qemu_coroutine_entered(Coroutine *co); Perhaps qemu_coroutine_running is a better name? Paolo > > > /** > diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c > index 3cbf225..737bffa 100644 > --- a/util/qemu-coroutine.c > +++ b/util/qemu-coroutine.c > @@ -146,3 +146,8 @@ void coroutine_fn qemu_coroutine_yield(void) > self->caller = NULL; > qemu_coroutine_switch(self, to, COROUTINE_YIELD); > } > + > +bool qemu_coroutine_entered(Coroutine *co) > +{ > +return co->caller; > +} >
Re: [Qemu-block] [PATCH 3/3] linux-aio: fix re-entrant completion processing
Hey Stefan, On Tue, Sep 27, 2016 at 4:06 PM, Stefan Hajnocziwrote: > Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process > completions from ioq_submit()") added an optimization that processes > completions each time ioq_submit() returns with requests in flight. > This commit introduces a "Co-routine re-entered recursively" error which > can be triggered with -drive format=qcow2,aio=native. > > Fam Zheng , Kevin Wolf , and I > debugged the following backtrace: > > (gdb) bt > #0 0x70a046f5 in raise () at /lib64/libc.so.6 > #1 0x70a062fa in abort () at /lib64/libc.so.6 > #2 0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at > util/qemu-coroutine.c:113 > #3 0x55a4b663 in qemu_laio_process_completions > (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218 > #4 0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at > block/linux-aio.c:331 > #5 0x55a4ba12 in laio_do_submit (fd=fd@entry=13, > laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, > type=type@entry=1) at block/linux-aio.c:383 > #6 0x55a4bbd3 in laio_co_submit (bs=, > s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at > block/linux-aio.c:402 > #7 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, > offset=offset@entry=2932727808, bytes=bytes@entry=8192, > qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804 > #8 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, > req=req@entry=0x59d38d20, offset=offset@entry=2932727808, > bytes=bytes@entry=8192, align=align@entry=512, > qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:1041 > #9 0x55a52db8 in bdrv_co_preadv (child=, > offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, > flags=flags@entry=0) at block/io.c:1133 > #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, > offset=6178725888, bytes=8192, qiov=0x57527840, flags=) at > block/qcow2.c:1509 > #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, > offset=offset@entry=6178725888, bytes=bytes@entry=8192, > qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804 > #12 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x56635890, > req=req@entry=0x59d39000, offset=offset@entry=6178725888, > bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x57527840, > flags=0) at block/io.c:1041 > #13 0x55a52db8 in bdrv_co_preadv (child=, > offset=offset@entry=6178725888, bytes=bytes@entry=8192, > qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133 > #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, > offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at > block/block-backend.c:783 > #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at > block/block-backend.c:991 > #16 0x55ac0cfa in coroutine_trampoline (i0=, > i1=) at util/coroutine-ucontext.c:78 > > It turned out that re-entrant ioq_submit() and completion processing > between three requests caused this error. The following check is not > sufficient to prevent recursively entering coroutines: > > if (laiocb->co != qemu_coroutine_self()) { > qemu_coroutine_enter(laiocb->co); > } > > As the following coroutine backtrace shows, not just the current > coroutine (self) can be entered. There might also be other coroutines > that are currently entered and transferred control due to the qcow2 lock > (CoMutex): I doubt that that was introduced by the commit you've specified: 0ed93d84edab. Before my patch coroutine was unconditionally entered. The following is what was changed by 0ed93d84edab: if (laiocb->co) { -qemu_coroutine_enter(laiocb->co); +/* Jump and continue completion for foreign requests, don't do + * anything for current request, it will be completed shortly. */ +if (laiocb->co != qemu_coroutine_self()) { +qemu_coroutine_enter(laiocb->co); +} If you have a strong reproduction, could you please verify that. What worries me is the following: 1. Issue was introduced before and was unnoticed (ok). 2. Issue - is something else and/or was really introduced by commit 0ed93d84edab (not ok). Of course the 2. is not nice. Thanks. -- Roman
Re: [Qemu-block] [PATCH 3/3] linux-aio: fix re-entrant completion processing
On Tue, Sep 27, 2016 at 04:29:55PM +0200, Roman Penyaev wrote: > On Tue, Sep 27, 2016 at 4:06 PM, Stefan Hajnocziwrote: > > Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process > > completions from ioq_submit()") added an optimization that processes > > completions each time ioq_submit() returns with requests in flight. > > This commit introduces a "Co-routine re-entered recursively" error which > > can be triggered with -drive format=qcow2,aio=native. > > > > Fam Zheng , Kevin Wolf , and I > > debugged the following backtrace: > > > > (gdb) bt > > #0 0x70a046f5 in raise () at /lib64/libc.so.6 > > #1 0x70a062fa in abort () at /lib64/libc.so.6 > > #2 0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at > > util/qemu-coroutine.c:113 > > #3 0x55a4b663 in qemu_laio_process_completions > > (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218 > > #4 0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at > > block/linux-aio.c:331 > > #5 0x55a4ba12 in laio_do_submit (fd=fd@entry=13, > > laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, > > type=type@entry=1) at block/linux-aio.c:383 > > #6 0x55a4bbd3 in laio_co_submit (bs=, > > s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at > > block/linux-aio.c:402 > > #7 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, > > offset=offset@entry=2932727808, bytes=bytes@entry=8192, > > qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804 > > #8 0x55a52b34 in bdrv_aligned_preadv > > (bs=bs@entry=0x5663bcb0, req=req@entry=0x59d38d20, > > offset=offset@entry=2932727808, bytes=bytes@entry=8192, > > align=align@entry=512, qiov=qiov@entry=0x59d38e20, flags=0) at > > block/io.c:1041 > > #9 0x55a52db8 in bdrv_co_preadv (child=, > > offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, > > flags=flags@entry=0) at block/io.c:1133 > > #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, > > offset=6178725888, bytes=8192, qiov=0x57527840, flags=) > > at block/qcow2.c:1509 > > #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, > > offset=offset@entry=6178725888, bytes=bytes@entry=8192, > > qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804 > > #12 0x55a52b34 in bdrv_aligned_preadv > > (bs=bs@entry=0x56635890, req=req@entry=0x59d39000, > > offset=offset@entry=6178725888, bytes=bytes@entry=8192, > > align=align@entry=1, qiov=qiov@entry=0x57527840, flags=0) at > > block/io.c:1041 > > #13 0x55a52db8 in bdrv_co_preadv (child=, > > offset=offset@entry=6178725888, bytes=bytes@entry=8192, > > qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133 > > #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, > > offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at > > block/block-backend.c:783 > > #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at > > block/block-backend.c:991 > > #16 0x55ac0cfa in coroutine_trampoline (i0=, > > i1=) at util/coroutine-ucontext.c:78 > > > > It turned out that re-entrant ioq_submit() and completion processing > > between three requests caused this error. The following check is not > > sufficient to prevent recursively entering coroutines: > > > > if (laiocb->co != qemu_coroutine_self()) { > > qemu_coroutine_enter(laiocb->co); > > } > > > > As the following coroutine backtrace shows, not just the current > > coroutine (self) can be entered. There might also be other coroutines > > that are currently entered and transferred control due to the qcow2 lock > > (CoMutex): > > I doubt that that was introduced by the commit you've specified: > 0ed93d84edab. > > Before my patch coroutine was unconditionally entered. The following > is what was changed by 0ed93d84edab: > > if (laiocb->co) { > -qemu_coroutine_enter(laiocb->co); > +/* Jump and continue completion for foreign requests, don't do > + * anything for current request, it will be completed shortly. */ > +if (laiocb->co != qemu_coroutine_self()) { > +qemu_coroutine_enter(laiocb->co); > +} Unconditionally entering was safe prior to 0ed93d84edab since all coroutines yielded and qemu_coroutine_entered() would be false all the time. Therefore it wasn't necessary to protect against re-entering a coroutine. > If you have a strong reproduction, could you please verify that. The bug is 100% deterministic. Just boot up a guest with -drive format=qcow2,aio=native. I noticed that I forgot to include a second backtrace in the commit description. I am resending the patch series with the missing backtrace. Hopefully that will make the bug clearer. Stefan signature.asc Description: PGP signature
Re: [Qemu-block] [PATCH v3 3/3] doc: Document driver-specific -blockdev options
On 09/26/2016 10:27 AM, Kevin Wolf wrote: > This documents the driver-specific options for the raw, qcow2 and file > block drivers for the man page. For everything else, we refer to the > QAPI documentation. > > Signed-off-by: Kevin Wolf> --- > qemu-options.hx | 115 > +++- > 1 file changed, 114 insertions(+), 1 deletion(-) > Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-block] [PATCH v2 3/3] linux-aio: fix re-entrant completion processing
Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process completions from ioq_submit()") added an optimization that processes completions each time ioq_submit() returns with requests in flight. This commit introduces a "Co-routine re-entered recursively" error which can be triggered with -drive format=qcow2,aio=native. Fam Zheng, Kevin Wolf , and I debugged the following backtrace: (gdb) bt #0 0x70a046f5 in raise () at /lib64/libc.so.6 #1 0x70a062fa in abort () at /lib64/libc.so.6 #2 0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at util/qemu-coroutine.c:113 #3 0x55a4b663 in qemu_laio_process_completions (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218 #4 0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at block/linux-aio.c:331 #5 0x55a4ba12 in laio_do_submit (fd=fd@entry=13, laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, type=type@entry=1) at block/linux-aio.c:383 #6 0x55a4bbd3 in laio_co_submit (bs=, s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at block/linux-aio.c:402 #7 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, offset=offset@entry=2932727808, bytes=bytes@entry=8192, qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804 #8 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, req=req@entry=0x59d38d20, offset=offset@entry=2932727808, bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:1041 #9 0x55a52db8 in bdrv_co_preadv (child=, offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, flags=flags@entry=0) at block/io.c:1133 #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, offset=6178725888, bytes=8192, qiov=0x57527840, flags=) at block/qcow2.c:1509 #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804 #12 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x56635890, req=req@entry=0x59d39000, offset=offset@entry=6178725888, bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x57527840, flags=0) at block/io.c:1041 #13 0x55a52db8 in bdrv_co_preadv (child=, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133 #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at block/block-backend.c:783 #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at block/block-backend.c:991 #16 0x55ac0cfa in coroutine_trampoline (i0=, i1=) at util/coroutine-ucontext.c:78 It turned out that re-entrant ioq_submit() and completion processing between three requests caused this error. The following check is not sufficient to prevent recursively entering coroutines: if (laiocb->co != qemu_coroutine_self()) { qemu_coroutine_enter(laiocb->co); } As the following coroutine backtrace shows, not just the current coroutine (self) can be entered. There might also be other coroutines that are currently entered and transferred control due to the qcow2 lock (CoMutex): (gdb) qemu coroutine 0x583464d0 #0 0x55ac0c90 in qemu_coroutine_switch (from_=from_@entry=0x583464d0, to_=to_@entry=0x572f9890, action=action@entry=COROUTINE_ENTER) at util/coroutine-ucontext.c:175 #1 0x55abfe54 in qemu_coroutine_enter (co=0x572f9890) at util/qemu-coroutine.c:117 #2 0x55ac031c in qemu_co_queue_run_restart (co=co@entry=0x583462c0) at util/qemu-coroutine-lock.c:60 #3 0x55abfe5e in qemu_coroutine_enter (co=0x583462c0) at util/qemu-coroutine.c:119 #4 0x55a4b663 in qemu_laio_process_completions (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218 #5 0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at block/linux-aio.c:331 #6 0x55a4ba12 in laio_do_submit (fd=fd@entry=13, laiocb=laiocb@entry=0x5a338b40, offset=offset@entry=2911477760, type=type@entry=1) at block/linux-aio.c:383 #7 0x55a4bbd3 in laio_co_submit (bs=, s=0x57e2f7f0, fd=13, offset=2911477760, qiov=0x5a338e80, type=1) at block/linux-aio.c:402 #8 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, offset=offset@entry=2911477760, bytes=bytes@entry=8192, qiov=qiov@entry=0x5a338e80, flags=0) at block/io.c:804 #9 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, req=req@entry=0x5a338d80, offset=offset@entry=2911477760, bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x5a338e80, flags=0) at block/io.c:1041 #10 0x55a52db8 in bdrv_co_preadv (child=, offset=2911477760,
[Qemu-block] [PATCH v2 2/3] test-coroutine: test qemu_coroutine_entered()
Signed-off-by: Stefan Hajnoczi--- tests/test-coroutine.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c index 6431dd6..abd97c2 100644 --- a/tests/test-coroutine.c +++ b/tests/test-coroutine.c @@ -53,6 +53,47 @@ static void test_self(void) } /* + * Check that qemu_coroutine_entered() works + */ + +static void coroutine_fn verify_entered_step_2(void *opaque) +{ +Coroutine *caller = (Coroutine *)opaque; + +g_assert(qemu_coroutine_entered(caller)); +g_assert(qemu_coroutine_entered(qemu_coroutine_self())); +qemu_coroutine_yield(); + +/* Once more to check it still works after yielding */ +g_assert(qemu_coroutine_entered(caller)); +g_assert(qemu_coroutine_entered(qemu_coroutine_self())); +qemu_coroutine_yield(); +} + +static void coroutine_fn verify_entered_step_1(void *opaque) +{ +Coroutine *self = qemu_coroutine_self(); +Coroutine *coroutine; + +g_assert(qemu_coroutine_entered(self)); + +coroutine = qemu_coroutine_create(verify_entered_step_2, self); +g_assert(!qemu_coroutine_entered(coroutine)); +qemu_coroutine_enter(coroutine); +g_assert(!qemu_coroutine_entered(coroutine)); +qemu_coroutine_enter(coroutine); +} + +static void test_entered(void) +{ +Coroutine *coroutine; + +coroutine = qemu_coroutine_create(verify_entered_step_1, NULL); +g_assert(!qemu_coroutine_entered(coroutine)); +qemu_coroutine_enter(coroutine); +} + +/* * Check that coroutines may nest multiple levels */ @@ -389,6 +430,7 @@ int main(int argc, char **argv) g_test_add_func("/basic/yield", test_yield); g_test_add_func("/basic/nesting", test_nesting); g_test_add_func("/basic/self", test_self); +g_test_add_func("/basic/entered", test_entered); g_test_add_func("/basic/in_coroutine", test_in_coroutine); g_test_add_func("/basic/order", test_order); if (g_test_perf()) { -- 2.7.4
[Qemu-block] [PATCH v2 1/3] coroutine: add qemu_coroutine_entered() function
See the doc comments for a description of this new coroutine API. Signed-off-by: Stefan Hajnoczi--- include/qemu/coroutine.h | 13 + util/qemu-coroutine.c| 5 + 2 files changed, 18 insertions(+) diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h index 29a2078..e6a60d5 100644 --- a/include/qemu/coroutine.h +++ b/include/qemu/coroutine.h @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void); */ bool qemu_in_coroutine(void); +/** + * Return true if the coroutine is currently entered + * + * A coroutine is "entered" if it has not yielded from the current + * qemu_coroutine_enter() call used to run it. This does not mean that the + * coroutine is currently executing code since it may have transferred control + * to another coroutine using qemu_coroutine_enter(). + * + * When several coroutines enter each other there may be no way to know which + * ones have already been entered. In such situations this function can be + * used to avoid recursively entering coroutines. + */ +bool qemu_coroutine_entered(Coroutine *co); /** diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c index 3cbf225..737bffa 100644 --- a/util/qemu-coroutine.c +++ b/util/qemu-coroutine.c @@ -146,3 +146,8 @@ void coroutine_fn qemu_coroutine_yield(void) self->caller = NULL; qemu_coroutine_switch(self, to, COROUTINE_YIELD); } + +bool qemu_coroutine_entered(Coroutine *co) +{ +return co->caller; +} -- 2.7.4
[Qemu-block] [PATCH v2 0/3] linux-aio: fix "Co-routine re-entered recursively" error
v2: * Add missing backtrace in Patch 3 It's possible to hit the "Co-routine re-entered recursively" error with -drive format=qcow2,aio=native. This is a regression introduced by a linux-aio.c optimization. See Patch 3 for details. Patches 1 & 2 add a new coroutine API called qemu_coroutine_entered() for checking whether a coroutine is currently "entered". This makes it possible to avoid re-entering coroutines recursively. Stefan Hajnoczi (3): coroutine: add qemu_coroutine_entered() function test-coroutine: test qemu_coroutine_entered() linux-aio: fix re-entrant completion processing block/linux-aio.c| 9 ++--- include/qemu/coroutine.h | 13 + tests/test-coroutine.c | 42 ++ util/qemu-coroutine.c| 5 + 4 files changed, 66 insertions(+), 3 deletions(-) -- 2.7.4
[Qemu-block] [PATCH 3/3] linux-aio: fix re-entrant completion processing
Commit 0ed93d84edabc7656f5c998ae1a346fe8b94ca54 ("linux-aio: process completions from ioq_submit()") added an optimization that processes completions each time ioq_submit() returns with requests in flight. This commit introduces a "Co-routine re-entered recursively" error which can be triggered with -drive format=qcow2,aio=native. Fam Zheng, Kevin Wolf , and I debugged the following backtrace: (gdb) bt #0 0x70a046f5 in raise () at /lib64/libc.so.6 #1 0x70a062fa in abort () at /lib64/libc.so.6 #2 0x55ac0013 in qemu_coroutine_enter (co=0x583464d0) at util/qemu-coroutine.c:113 #3 0x55a4b663 in qemu_laio_process_completions (s=s@entry=0x57e2f7f0) at block/linux-aio.c:218 #4 0x55a4b874 in ioq_submit (s=s@entry=0x57e2f7f0) at block/linux-aio.c:331 #5 0x55a4ba12 in laio_do_submit (fd=fd@entry=13, laiocb=laiocb@entry=0x59d38ae0, offset=offset@entry=2932727808, type=type@entry=1) at block/linux-aio.c:383 #6 0x55a4bbd3 in laio_co_submit (bs=, s=0x57e2f7f0, fd=13, offset=2932727808, qiov=0x59d38e20, type=1) at block/linux-aio.c:402 #7 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x5663bcb0, offset=offset@entry=2932727808, bytes=bytes@entry=8192, qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:804 #8 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x5663bcb0, req=req@entry=0x59d38d20, offset=offset@entry=2932727808, bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x59d38e20, flags=0) at block/io.c:1041 #9 0x55a52db8 in bdrv_co_preadv (child=, offset=2932727808, bytes=8192, qiov=qiov@entry=0x59d38e20, flags=flags@entry=0) at block/io.c:1133 #10 0x55a29629 in qcow2_co_preadv (bs=0x56635890, offset=6178725888, bytes=8192, qiov=0x57527840, flags=) at block/qcow2.c:1509 #11 0x55a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x56635890, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x57527840, flags=0) at block/io.c:804 #12 0x55a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x56635890, req=req@entry=0x59d39000, offset=offset@entry=6178725888, bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x57527840, flags=0) at block/io.c:1041 #13 0x55a52db8 in bdrv_co_preadv (child=, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x57527840, flags=flags@entry=0) at block/io.c:1133 #14 0x55a4515a in blk_co_preadv (blk=0x566356d0, offset=6178725888, bytes=8192, qiov=0x57527840, flags=0) at block/block-backend.c:783 #15 0x55a45266 in blk_aio_read_entry (opaque=0x577025e0) at block/block-backend.c:991 #16 0x55ac0cfa in coroutine_trampoline (i0=, i1=) at util/coroutine-ucontext.c:78 It turned out that re-entrant ioq_submit() and completion processing between three requests caused this error. The following check is not sufficient to prevent recursively entering coroutines: if (laiocb->co != qemu_coroutine_self()) { qemu_coroutine_enter(laiocb->co); } As the following coroutine backtrace shows, not just the current coroutine (self) can be entered. There might also be other coroutines that are currently entered and transferred control due to the qcow2 lock (CoMutex): (gdb) qemu coroutine 0x583464d0 Use the new qemu_coroutine_entered() function instead of comparing against qemu_coroutine_self(). This is correct because: 1. If a coroutine is not entered then it must have yielded to wait for I/O completion. It is therefore safe to enter. 2. If a coroutine is entered then it must be in ioq_submit()/qemu_laio_process_completions() because otherwise it would be yielded while waiting for I/O completion. Therefore it will check laio->ret and return from ioq_submit() instead of yielding, i.e. it's guaranteed not to hang. Reported-by: Fam Zheng Tested-by: Fam Zheng Signed-off-by: Stefan Hajnoczi --- block/linux-aio.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/block/linux-aio.c b/block/linux-aio.c index d4e19d4..1685ec2 100644 --- a/block/linux-aio.c +++ b/block/linux-aio.c @@ -94,9 +94,12 @@ static void qemu_laio_process_completion(struct qemu_laiocb *laiocb) laiocb->ret = ret; if (laiocb->co) { -/* Jump and continue completion for foreign requests, don't do - * anything for current request, it will be completed shortly. */ -if (laiocb->co != qemu_coroutine_self()) { +/* If the coroutine is already entered it must be in ioq_submit() and + * will notice laio->ret has been filled in when it eventually runs + * later. Coroutines cannot be entered recursively so avoid doing + * that! + */ +if (!qemu_coroutine_entered(laiocb->co)) {
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
On 09/27/2016 03:07 PM, Paolo Bonzini wrote: > > - Original Message - >> From: "Denis V. Lunev">> To: "Paolo Bonzini" >> Cc: "Vladimir Sementsov-Ogievskiy" , >> qemu-de...@nongnu.org, qemu-block@nongnu.org, >> nbd-gene...@lists.sourceforge.net, a...@alex.org.uk, ebl...@redhat.com, >> kw...@redhat.com, stefa...@redhat.com, >> w...@uter.be >> Sent: Tuesday, September 27, 2016 12:25:54 PM >> Subject: Re: [PATCH] proto: add 'shift' extension. >> >> On 09/27/2016 01:15 PM, Paolo Bonzini wrote: We could go in a different direction and export flag 'has_zero_init' which will report that the storage is initialized with all zeroes at the moment. In this case mirroring code will not fall into this branch. >>> Why don't you add the zero_init flag to QEMU's NBD driver instead? >> for all cases without knowing real backend on the server side? >> I think this would be wrong. > Add it to the command line, and leave it to libvirt or the user to > pass "-drive file.driver=nbd,...,file.zero-init=on". > > Paolo I have started with something very similar for 'drive-mirror' command. We have added additional flag for this to improve migration speed and this was rejected. Den
[Qemu-block] [PULL 01/18] block: reintroduce bdrv_flush_all
From: John SnowCommit fe1a9cbc moved the flush_all routine from the bdrv layer to the block-backend layer. In doing so, however, the semantics of the routine changed slightly such that flush_all now used blk_flush instead of bdrv_flush. blk_flush can fail if the attached device model reports that it is not "available," (i.e. the tray is open.) This changed the semantics of flush_all such that it can now fail for e.g. open CDROM drives. Reintroduce bdrv_flush_all to regain the old semantics without having to alter the behavior of blk_flush or blk_flush_all, which are already 'doing the right thing.' Signed-off-by: John Snow Reviewed-by: Kevin Wolf Reviewed-by: Max Reitz Acked-by: Fam Zheng Signed-off-by: Kevin Wolf --- block/io.c| 25 + include/block/block.h | 1 + 2 files changed, 26 insertions(+) diff --git a/block/io.c b/block/io.c index fdf7080..57a2eeb 100644 --- a/block/io.c +++ b/block/io.c @@ -1619,6 +1619,31 @@ int coroutine_fn bdrv_co_pwrite_zeroes(BdrvChild *child, int64_t offset, BDRV_REQ_ZERO_WRITE | flags); } +/* + * Flush ALL BDSes regardless of if they are reachable via a BlkBackend or not. + */ +int bdrv_flush_all(void) +{ +BdrvNextIterator it; +BlockDriverState *bs = NULL; +int result = 0; + +for (bs = bdrv_first(); bs; bs = bdrv_next()) { +AioContext *aio_context = bdrv_get_aio_context(bs); +int ret; + +aio_context_acquire(aio_context); +ret = bdrv_flush(bs); +if (ret < 0 && !result) { +result = ret; +} +aio_context_release(aio_context); +} + +return result; +} + + typedef struct BdrvCoGetBlockStatusData { BlockDriverState *bs; BlockDriverState *base; diff --git a/include/block/block.h b/include/block/block.h index e18233a..811b060 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -333,6 +333,7 @@ int bdrv_inactivate_all(void); /* Ensure contents are flushed to disk. */ int bdrv_flush(BlockDriverState *bs); int coroutine_fn bdrv_co_flush(BlockDriverState *bs); +int bdrv_flush_all(void); void bdrv_close_all(void); void bdrv_drain(BlockDriverState *bs); void coroutine_fn bdrv_co_drain(BlockDriverState *bs); -- 1.8.3.1
[Qemu-block] [PULL 04/18] block: Fix error path in qmp_blockdev_change_medium()
Commit 00949bab incorrectly changed one instance of into errp while touching the line. Change it back. Signed-off-by: Kevin WolfReviewed-by: Max Reitz --- blockdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blockdev.c b/blockdev.c index 29c6561..62d0dd0 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2614,7 +2614,7 @@ void qmp_blockdev_change_medium(bool has_device, const char *device, error_free(err); err = NULL; -qmp_x_blockdev_remove_medium(has_device, device, has_id, id, errp); +qmp_x_blockdev_remove_medium(has_device, device, has_id, id, ); if (err) { error_propagate(errp, err); goto fail; -- 1.8.3.1
[Qemu-block] [PATCH 1/3] coroutine: add qemu_coroutine_entered() function
See the doc comments for a description of this new coroutine API. Signed-off-by: Stefan Hajnoczi--- include/qemu/coroutine.h | 13 + util/qemu-coroutine.c| 5 + 2 files changed, 18 insertions(+) diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h index 29a2078..e6a60d5 100644 --- a/include/qemu/coroutine.h +++ b/include/qemu/coroutine.h @@ -92,6 +92,19 @@ Coroutine *coroutine_fn qemu_coroutine_self(void); */ bool qemu_in_coroutine(void); +/** + * Return true if the coroutine is currently entered + * + * A coroutine is "entered" if it has not yielded from the current + * qemu_coroutine_enter() call used to run it. This does not mean that the + * coroutine is currently executing code since it may have transferred control + * to another coroutine using qemu_coroutine_enter(). + * + * When several coroutines enter each other there may be no way to know which + * ones have already been entered. In such situations this function can be + * used to avoid recursively entering coroutines. + */ +bool qemu_coroutine_entered(Coroutine *co); /** diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c index 3cbf225..737bffa 100644 --- a/util/qemu-coroutine.c +++ b/util/qemu-coroutine.c @@ -146,3 +146,8 @@ void coroutine_fn qemu_coroutine_yield(void) self->caller = NULL; qemu_coroutine_switch(self, to, COROUTINE_YIELD); } + +bool qemu_coroutine_entered(Coroutine *co) +{ +return co->caller; +} -- 2.7.4
[Qemu-block] [PULL 10/18] block: Move 'discard' option to bdrv_open_common()
This enables its use for nested child nodes. The compatibility between the 'discard' and 'detect-zeroes' setting is checked in bdrv_open_common() now as the former setting isn't available before calling bdrv_open() any more. Signed-off-by: Kevin WolfReviewed-by: Eric Blake Reviewed-by: Max Reitz --- block.c | 17 - blockdev.c| 25 - include/block/block.h | 1 + 3 files changed, 17 insertions(+), 26 deletions(-) diff --git a/block.c b/block.c index 1f10457..bb1f1ec 100644 --- a/block.c +++ b/block.c @@ -765,7 +765,7 @@ static void bdrv_inherited_options(int *child_flags, QDict *child_options, /* Our block drivers take care to send flushes and respect unmap policy, * so we can default to enable both on lower layers regardless of the * corresponding parent options. */ -flags |= BDRV_O_UNMAP; +qdict_set_default_str(child_options, BDRV_OPT_DISCARD, "unmap"); /* Clear flags that only apply to the top layer */ flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_COPY_ON_READ | @@ -960,6 +960,11 @@ static QemuOptsList bdrv_runtime_opts = { .type = QEMU_OPT_STRING, .help = "try to optimize zero writes (off, on, unmap)", }, +{ +.name = "discard", +.type = QEMU_OPT_STRING, +.help = "discard operation (ignore/off, unmap/on)", +}, { /* end of list */ } }, }; @@ -976,6 +981,7 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file, const char *filename; const char *driver_name = NULL; const char *node_name = NULL; +const char *discard; const char *detect_zeroes; QemuOpts *opts; BlockDriver *drv; @@ -1045,6 +1051,15 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file, } } +discard = qemu_opt_get(opts, "discard"); +if (discard != NULL) { +if (bdrv_parse_discard_flags(discard, >open_flags) != 0) { +error_setg(errp, "Invalid discard option"); +ret = -EINVAL; +goto fail_opts; +} +} + detect_zeroes = qemu_opt_get(opts, "detect-zeroes"); if (detect_zeroes) { BlockdevDetectZeroesOptions value = diff --git a/blockdev.c b/blockdev.c index 7b87bd8..e2ace04 100644 --- a/blockdev.c +++ b/blockdev.c @@ -356,7 +356,6 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags, const char **throttling_group, ThrottleConfig *throttle_cfg, BlockdevDetectZeroesOptions *detect_zeroes, Error **errp) { -const char *discard; Error *local_error = NULL; const char *aio; @@ -365,13 +364,6 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags, *bdrv_flags |= BDRV_O_COPY_ON_READ; } -if ((discard = qemu_opt_get(opts, "discard")) != NULL) { -if (bdrv_parse_discard_flags(discard, bdrv_flags) != 0) { -error_setg(errp, "Invalid discard option"); -return; -} -} - if ((aio = qemu_opt_get(opts, "aio")) != NULL) { if (!strcmp(aio, "native")) { *bdrv_flags |= BDRV_O_NATIVE_AIO; @@ -449,15 +441,6 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags, error_propagate(errp, local_error); return; } - -if (bdrv_flags && -*detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP && -!(*bdrv_flags & BDRV_O_UNMAP)) -{ -error_setg(errp, "setting detect-zeroes to unmap is not allowed " - "without setting discard operation to unmap"); -return; -} } } @@ -3990,10 +3973,6 @@ QemuOptsList qemu_common_drive_opts = { .type = QEMU_OPT_BOOL, .help = "enable/disable snapshot mode", },{ -.name = "discard", -.type = QEMU_OPT_STRING, -.help = "discard operation (ignore/off, unmap/on)", -},{ .name = "aio", .type = QEMU_OPT_STRING, .help = "host AIO implementation (threads, native)", @@ -4125,10 +4104,6 @@ static QemuOptsList qemu_root_bds_opts = { .head = QTAILQ_HEAD_INITIALIZER(qemu_root_bds_opts.head), .desc = { { -.name = "discard", -.type = QEMU_OPT_STRING, -.help = "discard operation (ignore/off, unmap/on)", -},{ .name = "aio", .type = QEMU_OPT_STRING, .help = "host AIO implementation (threads, native)", diff --git a/include/block/block.h b/include/block/block.h index 811b060..107c603 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -108,6 +108,7 @@ typedef struct HDGeometry { #define BDRV_OPT_CACHE_DIRECT "cache.direct" #define
Re: [Qemu-block] [PATCH v5 7/9] block: don't make snapshots for filters
> From: Kevin Wolf [mailto:kw...@redhat.com] > Am 26.09.2016 um 11:51 hat Pavel Dovgalyuk geschrieben: > > > From: Kevin Wolf [mailto:kw...@redhat.com] > > > Am 26.09.2016 um 10:08 hat Pavel Dovgalyuk geschrieben: > > > > This patch disables snapshotting for block driver filters. > > > > It is needed, because snapshots should be created > > > > in underlying disk images, not in filters itself. > > > > > > > > Signed-off-by: Pavel Dovgalyuk> > > > > > But that's exactly what the existing code implements? If a driver > > > doesn't provide .bdrv_snapshot_goto, the request is redirected to > > > bs->file. > > > > > > > block/snapshot.c |3 +++ > > > > 1 file changed, 3 insertions(+) > > > > > > > > diff --git a/block/snapshot.c b/block/snapshot.c > > > > index bf5c2ca..8998b8b 100644 > > > > --- a/block/snapshot.c > > > > +++ b/block/snapshot.c > > > > @@ -184,6 +184,9 @@ int bdrv_snapshot_goto(BlockDriverState *bs, > > > > if (!drv) { > > > > return -ENOMEDIUM; > > > > } > > > > +if (drv->is_filter) { > > > > +return 0; > > > > +} > > > > > > This, on the other hand, doesn't redirect the request, but silently > > > ignores it. That is, loading the snapshot will apparently succeed, but > > > it wouldn't actually load anything and the disk would stay in its > > > current state. > > > > In my use case bdrv_all_goto_snapshot iterates all block drivers, including > > filters and disk images. Therefore skipping goto for images is ok. > > Hm, this can happy today indeed. > > Originally, we only called bdrv_goto_snapshot() for all _top level_ > BDSes, and this is still what you normally get. However, if you > explicitly create a BDS (e.g. with its own -drive option), it is > considered a top level BDS without actually being top level for the > guest, and therefore the snapshotting function is called for it. > > Of course, this is highly inefficient because the goto_snapshot request > is passed by the filter driver and then called another time for the > lower node, effectively loading the snapshot a second time. > > On the other hand if you use a single -drive option to create both the > qcow2 BDS and the blkreplay filter, we do need to pass down the > goto_snapshot request because it won't be called for the qcow2 layer > otherwise. How this can be specified in command line? I believed that separate -drive option is required. > > I'm not completely sure yet what the right behaviour would be here. Pavel Dovgalyuk
[Qemu-block] [PATCH 0/3] linux-aio: fix "Co-routine re-entered recursively" error
It's possible to hit the "Co-routine re-entered recursively" error with -drive format=qcow2,aio=native. This is a regression introduced by a linux-aio.c optimization. See Patch 3 for details. Patches 1 & 2 add a new coroutine API called qemu_coroutine_entered() for checking whether a coroutine is currently "entered". This makes it possible to avoid re-entering coroutines recursively. Stefan Hajnoczi (3): coroutine: add qemu_coroutine_entered() function test-coroutine: test qemu_coroutine_entered() linux-aio: fix re-entrant completion processing block/linux-aio.c| 9 ++--- include/qemu/coroutine.h | 13 + tests/test-coroutine.c | 42 ++ util/qemu-coroutine.c| 5 + 4 files changed, 66 insertions(+), 3 deletions(-) -- 2.7.4
[Qemu-block] [PULL 03/18] block-backend: remove blk_flush_all
From: John SnowWe can teach Xen to drain and flush each device as it needs to, instead of trying to flush ALL devices. This removes the last user of blk_flush_all. The function is therefore removed under the premise that any new uses of blk_flush_all would be the wrong paradigm: either flush the single device that requires flushing, or use an appropriate flush_all mechanism from outside of the BlkBackend layer. Signed-off-by: John Snow Reviewed-by: Max Reitz Acked-by: Fam Zheng Signed-off-by: Kevin Wolf --- block/block-backend.c | 22 -- hw/i386/xen/xen_platform.c | 2 -- hw/ide/piix.c | 4 include/sysemu/block-backend.h | 1 - 4 files changed, 4 insertions(+), 25 deletions(-) diff --git a/block/block-backend.c b/block/block-backend.c index 0bd19ab..f34bad5 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1640,28 +1640,6 @@ int blk_commit_all(void) return 0; } -int blk_flush_all(void) -{ -BlockBackend *blk = NULL; -int result = 0; - -while ((blk = blk_all_next(blk)) != NULL) { -AioContext *aio_context = blk_get_aio_context(blk); -int ret; - -aio_context_acquire(aio_context); -if (blk_is_inserted(blk)) { -ret = blk_flush(blk); -if (ret < 0 && !result) { -result = ret; -} -} -aio_context_release(aio_context); -} - -return result; -} - /* throttling disk I/O limits */ void blk_set_io_limits(BlockBackend *blk, ThrottleConfig *cfg) diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c index aa78393..f85635c 100644 --- a/hw/i386/xen/xen_platform.c +++ b/hw/i386/xen/xen_platform.c @@ -134,8 +134,6 @@ static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t v devices, and bit 2 the non-primary-master IDE devices. */ if (val & UNPLUG_ALL_IDE_DISKS) { DPRINTF("unplug disks\n"); -blk_drain_all(); -blk_flush_all(); pci_unplug_disks(pci_dev->bus); } if (val & UNPLUG_ALL_NICS) { diff --git a/hw/ide/piix.c b/hw/ide/piix.c index c190fca..d5777fd 100644 --- a/hw/ide/piix.c +++ b/hw/ide/piix.c @@ -179,6 +179,10 @@ int pci_piix3_xen_ide_unplug(DeviceState *dev) if (di != NULL && !di->media_cd) { BlockBackend *blk = blk_by_legacy_dinfo(di); DeviceState *ds = blk_get_attached_dev(blk); + +blk_drain(blk); +blk_flush(blk); + if (ds) { blk_detach_dev(blk, ds); } diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h index 3b29317..24d1d85 100644 --- a/include/sysemu/block-backend.h +++ b/include/sysemu/block-backend.h @@ -152,7 +152,6 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf, int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int count); int blk_co_flush(BlockBackend *blk); int blk_flush(BlockBackend *blk); -int blk_flush_all(void); int blk_commit_all(void); void blk_drain(BlockBackend *blk); void blk_drain_all(void); -- 1.8.3.1
[Qemu-block] [PULL 15/18] coroutine-ucontext: use helper for allocating stack memory
From: Peter LievenSigned-off-by: Peter Lieven Signed-off-by: Kevin Wolf --- util/coroutine-ucontext.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c index 31254ab..6621f3f 100644 --- a/util/coroutine-ucontext.c +++ b/util/coroutine-ucontext.c @@ -34,6 +34,7 @@ typedef struct { Coroutine base; void *stack; +size_t stack_size; sigjmp_buf env; #ifdef CONFIG_VALGRIND_H @@ -82,7 +83,6 @@ static void coroutine_trampoline(int i0, int i1) Coroutine *qemu_coroutine_new(void) { -const size_t stack_size = COROUTINE_STACK_SIZE; CoroutineUContext *co; ucontext_t old_uc, uc; sigjmp_buf old_env; @@ -101,17 +101,18 @@ Coroutine *qemu_coroutine_new(void) } co = g_malloc0(sizeof(*co)); -co->stack = g_malloc(stack_size); +co->stack_size = COROUTINE_STACK_SIZE; +co->stack = qemu_alloc_stack(>stack_size); co->base.entry_arg = _env; /* stash away our jmp_buf */ uc.uc_link = _uc; uc.uc_stack.ss_sp = co->stack; -uc.uc_stack.ss_size = stack_size; +uc.uc_stack.ss_size = co->stack_size; uc.uc_stack.ss_flags = 0; #ifdef CONFIG_VALGRIND_H co->valgrind_stack_id = -VALGRIND_STACK_REGISTER(co->stack, co->stack + stack_size); +VALGRIND_STACK_REGISTER(co->stack, co->stack + co->stack_size); #endif arg.p = co; @@ -149,7 +150,7 @@ void qemu_coroutine_delete(Coroutine *co_) valgrind_stack_deregister(co); #endif -g_free(co->stack); +qemu_free_stack(co->stack, co->stack_size); g_free(co); } -- 1.8.3.1
[Qemu-block] [PATCH 2/3] test-coroutine: test qemu_coroutine_entered()
Signed-off-by: Stefan Hajnoczi--- tests/test-coroutine.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c index 6431dd6..abd97c2 100644 --- a/tests/test-coroutine.c +++ b/tests/test-coroutine.c @@ -53,6 +53,47 @@ static void test_self(void) } /* + * Check that qemu_coroutine_entered() works + */ + +static void coroutine_fn verify_entered_step_2(void *opaque) +{ +Coroutine *caller = (Coroutine *)opaque; + +g_assert(qemu_coroutine_entered(caller)); +g_assert(qemu_coroutine_entered(qemu_coroutine_self())); +qemu_coroutine_yield(); + +/* Once more to check it still works after yielding */ +g_assert(qemu_coroutine_entered(caller)); +g_assert(qemu_coroutine_entered(qemu_coroutine_self())); +qemu_coroutine_yield(); +} + +static void coroutine_fn verify_entered_step_1(void *opaque) +{ +Coroutine *self = qemu_coroutine_self(); +Coroutine *coroutine; + +g_assert(qemu_coroutine_entered(self)); + +coroutine = qemu_coroutine_create(verify_entered_step_2, self); +g_assert(!qemu_coroutine_entered(coroutine)); +qemu_coroutine_enter(coroutine); +g_assert(!qemu_coroutine_entered(coroutine)); +qemu_coroutine_enter(coroutine); +} + +static void test_entered(void) +{ +Coroutine *coroutine; + +coroutine = qemu_coroutine_create(verify_entered_step_1, NULL); +g_assert(!qemu_coroutine_entered(coroutine)); +qemu_coroutine_enter(coroutine); +} + +/* * Check that coroutines may nest multiple levels */ @@ -389,6 +430,7 @@ int main(int argc, char **argv) g_test_add_func("/basic/yield", test_yield); g_test_add_func("/basic/nesting", test_nesting); g_test_add_func("/basic/self", test_self); +g_test_add_func("/basic/entered", test_entered); g_test_add_func("/basic/in_coroutine", test_in_coroutine); g_test_add_func("/basic/order", test_order); if (g_test_perf()) { -- 2.7.4
[Qemu-block] [PULL 16/18] coroutine-sigaltstack: use helper for allocating stack memory
From: Peter LievenSigned-off-by: Peter Lieven Signed-off-by: Kevin Wolf --- util/coroutine-sigaltstack.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c index a5bcb7e..f6fc49a 100644 --- a/util/coroutine-sigaltstack.c +++ b/util/coroutine-sigaltstack.c @@ -33,6 +33,7 @@ typedef struct { Coroutine base; void *stack; +size_t stack_size; sigjmp_buf env; } CoroutineSigAltStack; @@ -143,7 +144,6 @@ static void coroutine_trampoline(int signal) Coroutine *qemu_coroutine_new(void) { -const size_t stack_size = COROUTINE_STACK_SIZE; CoroutineSigAltStack *co; CoroutineThreadState *coTS; struct sigaction sa; @@ -164,7 +164,8 @@ Coroutine *qemu_coroutine_new(void) */ co = g_malloc0(sizeof(*co)); -co->stack = g_malloc(stack_size); +co->stack_size = COROUTINE_STACK_SIZE; +co->stack = qemu_alloc_stack(>stack_size); co->base.entry_arg = _env; /* stash away our jmp_buf */ coTS = coroutine_get_thread_state(); @@ -189,7 +190,7 @@ Coroutine *qemu_coroutine_new(void) * Set the new stack. */ ss.ss_sp = co->stack; -ss.ss_size = stack_size; +ss.ss_size = co->stack_size; ss.ss_flags = 0; if (sigaltstack(, ) < 0) { abort(); @@ -253,7 +254,7 @@ void qemu_coroutine_delete(Coroutine *co_) { CoroutineSigAltStack *co = DO_UPCAST(CoroutineSigAltStack, base, co_); -g_free(co->stack); +qemu_free_stack(co->stack, co->stack_size); g_free(co); } -- 1.8.3.1
[Qemu-block] [PULL 17/18] oslib-posix: add a configure switch to debug stack usage
From: Peter Lieventhis adds a knob to track the maximum stack usage of stacks created by qemu_alloc_stack. Signed-off-by: Peter Lieven Signed-off-by: Kevin Wolf --- configure | 19 +++ util/oslib-posix.c | 35 +++ 2 files changed, 54 insertions(+) diff --git a/configure b/configure index 8fa62ad..93ef00a 100755 --- a/configure +++ b/configure @@ -296,6 +296,7 @@ libiscsi="" libnfs="" coroutine="" coroutine_pool="" +debug_stack_usage="no" seccomp="" glusterfs="" glusterfs_xlator_opt="no" @@ -1004,6 +1005,8 @@ for opt do ;; --enable-coroutine-pool) coroutine_pool="yes" ;; + --enable-debug-stack-usage) debug_stack_usage="yes" + ;; --disable-docs) docs="no" ;; --enable-docs) docs="yes" @@ -4276,6 +4279,17 @@ if test "$coroutine" = "gthread" -a "$coroutine_pool" = "yes"; then error_exit "'gthread' coroutine backend does not support pool (use --disable-coroutine-pool)" fi +if test "$debug_stack_usage" = "yes"; then + if test "$cpu" = "ia64" -o "$cpu" = "hppa"; then +error_exit "stack usage debugging is not supported for $cpu" + fi + if test "$coroutine_pool" = "yes"; then +echo "WARN: disabling coroutine pool for stack usage debugging" +coroutine_pool=no + fi +fi + + ## # check if we have open_by_handle_at @@ -4861,6 +4875,7 @@ echo "QGA MSI support $guest_agent_msi" echo "seccomp support $seccomp" echo "coroutine backend $coroutine" echo "coroutine pool$coroutine_pool" +echo "debug stack usage $debug_stack_usage" echo "GlusterFS support $glusterfs" echo "Archipelago support $archipelago" echo "gcov $gcov_tool" @@ -5330,6 +5345,10 @@ else echo "CONFIG_COROUTINE_POOL=0" >> $config_host_mak fi +if test "$debug_stack_usage" = "yes" ; then + echo "CONFIG_DEBUG_STACK_USAGE=y" >> $config_host_mak +fi + if test "$open_by_handle_at" = "yes" ; then echo "CONFIG_OPEN_BY_HANDLE=y" >> $config_host_mak fi diff --git a/util/oslib-posix.c b/util/oslib-posix.c index d950c34..aaec189 100644 --- a/util/oslib-posix.c +++ b/util/oslib-posix.c @@ -50,6 +50,10 @@ #include "qemu/mmap-alloc.h" +#ifdef CONFIG_DEBUG_STACK_USAGE +#include "qemu/error-report.h" +#endif + int qemu_get_thread_id(void) { #if defined(__linux__) @@ -503,6 +507,9 @@ pid_t qemu_fork(Error **errp) void *qemu_alloc_stack(size_t *sz) { void *ptr, *guardpage; +#ifdef CONFIG_DEBUG_STACK_USAGE +void *ptr2; +#endif size_t pagesz = getpagesize(); #ifdef _SC_THREAD_STACK_MIN /* avoid stacks smaller than _SC_THREAD_STACK_MIN */ @@ -534,10 +541,38 @@ void *qemu_alloc_stack(size_t *sz) abort(); } +#ifdef CONFIG_DEBUG_STACK_USAGE +for (ptr2 = ptr + pagesz; ptr2 < ptr + *sz; ptr2 += sizeof(uint32_t)) { +*(uint32_t *)ptr2 = 0xdeadbeaf; +} +#endif + return ptr; } +#ifdef CONFIG_DEBUG_STACK_USAGE +static __thread unsigned int max_stack_usage; +#endif + void qemu_free_stack(void *stack, size_t sz) { +#ifdef CONFIG_DEBUG_STACK_USAGE +unsigned int usage; +void *ptr; + +for (ptr = stack + getpagesize(); ptr < stack + sz; + ptr += sizeof(uint32_t)) { +if (*(uint32_t *)ptr != 0xdeadbeaf) { +break; +} +} +usage = sz - (uintptr_t) (ptr - stack); +if (usage > max_stack_usage) { +error_report("thread %d max stack usage increased from %u to %u", + qemu_get_thread_id(), max_stack_usage, usage); +max_stack_usage = usage; +} +#endif + munmap(stack, sz); } -- 1.8.3.1
[Qemu-block] [PULL 06/18] block/qapi: Use separate options type for curl driver
We're going to add an option to the file drivers which doesn't apply to the curl drivers, so give them a separate option type. Signed-off-by: Kevin WolfReviewed-by: Eric Blake Reviewed-by: Max Reitz --- qapi/block-core.json | 25 ++--- 1 file changed, 18 insertions(+), 7 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 92193ab..b5fdd42 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1721,8 +1721,7 @@ ## # @BlockdevOptionsFile # -# Driver specific block device options for the file backend and similar -# protocols. +# Driver specific block device options for the file backend. # # @filename:path to the image file # @@ -2211,6 +2210,18 @@ '*top-id': 'str' } } ## +# @BlockdevOptionsCurl +# +# Driver specific block device options for the curl backend. +# +# @filename:path to the image file +# +# Since: 1.7 +## +{ 'struct': 'BlockdevOptionsCurl', + 'data': { 'filename': 'str' } } + +## # @BlockdevOptions # # Options for creating a block device. Many options are available for all @@ -2248,13 +2259,13 @@ 'cloop': 'BlockdevOptionsGenericFormat', 'dmg':'BlockdevOptionsGenericFormat', 'file': 'BlockdevOptionsFile', - 'ftp':'BlockdevOptionsFile', - 'ftps': 'BlockdevOptionsFile', + 'ftp':'BlockdevOptionsCurl', + 'ftps': 'BlockdevOptionsCurl', 'gluster':'BlockdevOptionsGluster', 'host_cdrom': 'BlockdevOptionsFile', 'host_device':'BlockdevOptionsFile', - 'http': 'BlockdevOptionsFile', - 'https': 'BlockdevOptionsFile', + 'http': 'BlockdevOptionsCurl', + 'https': 'BlockdevOptionsCurl', # TODO iscsi: Wait for structured options 'luks': 'BlockdevOptionsLUKS', # TODO nbd: Should take InetSocketAddress for 'host'? @@ -2271,7 +2282,7 @@ 'replication':'BlockdevOptionsReplication', # TODO sheepdog: Wait for structured options # TODO ssh: Should take InetSocketAddress for 'host'? - 'tftp': 'BlockdevOptionsFile', + 'tftp': 'BlockdevOptionsCurl', 'vdi':'BlockdevOptionsGenericFormat', 'vhdx': 'BlockdevOptionsGenericFormat', 'vmdk': 'BlockdevOptionsGenericCOWFormat', -- 1.8.3.1
[Qemu-block] [PULL 00/18] Block layer patches
The following changes since commit 7cfdc02dae0d2ff58c897496cfdbbafc0eda0f3f: Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging (2016-09-26 19:47:00 +0100) are available in the git repository at: git://repo.or.cz/qemu/kevin.git tags/for-upstream for you to fetch changes up to 3b856cebe5e93547852c156ca2119d075e62aed7: coroutine: reduce stack size to 60kB (2016-09-27 14:05:21 +0200) Block layer patches John Snow (3): block: reintroduce bdrv_flush_all qemu: use bdrv_flush_all for vm_stop et al block-backend: remove blk_flush_all Kevin Wolf (8): block: Fix error path in qmp_blockdev_change_medium() block: Drop aio/cache consistency check from qmp_blockdev_add() block/qapi: Use separate options type for curl driver block/qapi: Move 'aio' option to file driver block: Parse 'detect-zeroes' in bdrv_open_common() block: Use 'detect-zeroes' option for 'blockdev-change-medium' block: Move 'discard' option to bdrv_open_common() block: Remove qemu_root_bds_opts Peter Lieven (7): oslib-posix: add helpers for stack alloc and free coroutine-sigaltstack: rename coroutine struct appropriately coroutine: add a macro for the coroutine stack size coroutine-ucontext: use helper for allocating stack memory coroutine-sigaltstack: use helper for allocating stack memory oslib-posix: add a configure switch to debug stack usage coroutine: reduce stack size to 60kB block.c| 50 +- block/block-backend.c | 31 ++-- block/io.c | 25 + block/raw-posix.c | 44 +--- block/raw-win32.c | 56 +++-- blockdev.c | 112 +++-- configure | 19 +++ cpus.c | 4 +- hw/i386/xen/xen_platform.c | 2 - hw/ide/piix.c | 4 ++ include/block/block.h | 2 + include/qemu/coroutine_int.h | 2 + include/sysemu/block-backend.h | 3 +- include/sysemu/os-posix.h | 27 ++ qapi/block-core.json | 31 tests/qemu-iotests/087 | 4 +- tests/qemu-iotests/087.out | 2 +- util/coroutine-sigaltstack.c | 25 - util/coroutine-ucontext.c | 11 ++-- util/coroutine-win32.c | 2 +- util/oslib-posix.c | 77 21 files changed, 342 insertions(+), 191 deletions(-)
[Qemu-block] [PULL 05/18] block: Drop aio/cache consistency check from qmp_blockdev_add()
The TODO comment has been addressed a while ago and this is now checked in raw-posix, so we don't have to special case this in blockdev-add any more. Signed-off-by: Kevin WolfReviewed-by: Eric Blake Reviewed-by: Max Reitz --- blockdev.c | 15 --- tests/qemu-iotests/087.out | 2 +- 2 files changed, 1 insertion(+), 16 deletions(-) diff --git a/blockdev.c b/blockdev.c index 62d0dd0..7820f42 100644 --- a/blockdev.c +++ b/blockdev.c @@ -3832,21 +3832,6 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp) QDict *qdict; Error *local_err = NULL; -/* TODO Sort it out in raw-posix and drive_new(): Reject aio=native with - * cache.direct=false instead of silently switching to aio=threads, except - * when called from drive_new(). - * - * For now, simply forbidding the combination for all drivers will do. */ -if (options->has_aio && options->aio == BLOCKDEV_AIO_OPTIONS_NATIVE) { -bool direct = options->has_cache && - options->cache->has_direct && - options->cache->direct; -if (!direct) { -error_setg(errp, "aio=native requires cache.direct=true"); -goto fail; -} -} - visit_type_BlockdevOptions(v, NULL, , _err); if (local_err) { error_propagate(errp, local_err); diff --git a/tests/qemu-iotests/087.out b/tests/qemu-iotests/087.out index bef6862..cd02eae 100644 --- a/tests/qemu-iotests/087.out +++ b/tests/qemu-iotests/087.out @@ -27,7 +27,7 @@ QMP_VERSION Testing: QMP_VERSION {"return": {}} -{"error": {"class": "GenericError", "desc": "aio=native requires cache.direct=true"}} +{"error": {"class": "GenericError", "desc": "aio=native was specified, but it requires cache.direct=on, which was not specified."}} {"return": {}} {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN"} -- 1.8.3.1
[Qemu-block] [PULL 13/18] coroutine-sigaltstack: rename coroutine struct appropriately
From: Peter LievenThe name of the sigaltstack coroutine struct was misleading. Signed-off-by: Peter Lieven Signed-off-by: Kevin Wolf --- util/coroutine-sigaltstack.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c index a7c3366..171cd44 100644 --- a/util/coroutine-sigaltstack.c +++ b/util/coroutine-sigaltstack.c @@ -34,7 +34,7 @@ typedef struct { Coroutine base; void *stack; sigjmp_buf env; -} CoroutineUContext; +} CoroutineSigAltStack; /** * Per-thread coroutine bookkeeping @@ -44,7 +44,7 @@ typedef struct { Coroutine *current; /** The default coroutine */ -CoroutineUContext leader; +CoroutineSigAltStack leader; /** Information for the signal handler (trampoline) */ sigjmp_buf tr_reenter; @@ -89,7 +89,7 @@ static void __attribute__((constructor)) coroutine_init(void) * (from the signal handler when it is not signal handling, read ahead * for more information). */ -static void coroutine_bootstrap(CoroutineUContext *self, Coroutine *co) +static void coroutine_bootstrap(CoroutineSigAltStack *self, Coroutine *co) { /* Initialize longjmp environment and switch back the caller */ if (!sigsetjmp(self->env, 0)) { @@ -109,7 +109,7 @@ static void coroutine_bootstrap(CoroutineUContext *self, Coroutine *co) */ static void coroutine_trampoline(int signal) { -CoroutineUContext *self; +CoroutineSigAltStack *self; Coroutine *co; CoroutineThreadState *coTS; @@ -144,7 +144,7 @@ static void coroutine_trampoline(int signal) Coroutine *qemu_coroutine_new(void) { const size_t stack_size = 1 << 20; -CoroutineUContext *co; +CoroutineSigAltStack *co; CoroutineThreadState *coTS; struct sigaction sa; struct sigaction osa; @@ -251,7 +251,7 @@ Coroutine *qemu_coroutine_new(void) void qemu_coroutine_delete(Coroutine *co_) { -CoroutineUContext *co = DO_UPCAST(CoroutineUContext, base, co_); +CoroutineSigAltStack *co = DO_UPCAST(CoroutineSigAltStack, base, co_); g_free(co->stack); g_free(co); @@ -260,8 +260,8 @@ void qemu_coroutine_delete(Coroutine *co_) CoroutineAction qemu_coroutine_switch(Coroutine *from_, Coroutine *to_, CoroutineAction action) { -CoroutineUContext *from = DO_UPCAST(CoroutineUContext, base, from_); -CoroutineUContext *to = DO_UPCAST(CoroutineUContext, base, to_); +CoroutineSigAltStack *from = DO_UPCAST(CoroutineSigAltStack, base, from_); +CoroutineSigAltStack *to = DO_UPCAST(CoroutineSigAltStack, base, to_); CoroutineThreadState *s = coroutine_get_thread_state(); int ret; -- 1.8.3.1
[Qemu-block] [PULL 18/18] coroutine: reduce stack size to 60kB
From: Peter Lievenevaluation with the recently introduced maximum stack usage monitoring revealed that the actual used stack size was never above 4kB so allocating 1MB stack for each coroutine is a lot of wasted memory. So reduce the stack size to 60kB which should still give enough head room. The guard page added in qemu_alloc_stack will catch a potential stack overflow introduced by this commit. The 60kB + guard page will result in an allocation of 64kB per coroutine on systems where a page is 4kB. Signed-off-by: Peter Lieven Signed-off-by: Kevin Wolf --- include/qemu/coroutine_int.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h index 14d4f1d..be14260 100644 --- a/include/qemu/coroutine_int.h +++ b/include/qemu/coroutine_int.h @@ -28,7 +28,7 @@ #include "qemu/queue.h" #include "qemu/coroutine.h" -#define COROUTINE_STACK_SIZE (1 << 20) +#define COROUTINE_STACK_SIZE 61440 typedef enum { COROUTINE_YIELD = 1, -- 1.8.3.1
Re: [Qemu-block] [Qemu-devel] [PATCH v3] block: Turn on "unmap" in active commit
On 09/27/2016 06:14 AM, Fam Zheng wrote: > We already specified BDRV_O_UNMAP when opening images in 'qemu-img > commit', but didn't turn on the "unmap" in the active commit job. This > patch fixes that so that zeroed clusters in top image can be discarded > which is desired in the virt-sparsify use case, where a temporary > overlay is created and fstrim'ed before commiting back, to free space in s/commiting/committing/ > the original image. > > This also enables it for block-commit. > > Signed-off-by: Fam Zheng> --- > v3: Change the right parameter. > v2: Add "unmap" to block-commit as well. [Kevin] > --- > block/mirror.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [Qemu-devel] [PATCH v3] block: Turn on "unmap" in active commit
On 09/27/2016 06:14 AM, Fam Zheng wrote: > We already specified BDRV_O_UNMAP when opening images in 'qemu-img > commit', but didn't turn on the "unmap" in the active commit job. This > patch fixes that so that zeroed clusters in top image can be discarded > which is desired in the virt-sparsify use case, where a temporary > overlay is created and fstrim'ed before commiting back, to free space in > the original image. > > This also enables it for block-commit. > > Signed-off-by: Fam Zheng> --- > v3: Change the right parameter. Doesn't affect this patch, but it may be worth using the 'boxed':true notation in the .json file to make it more compact to pass job information around via a struct rather than a large mess of parameters where you are likely to get the wrong one. > v2: Add "unmap" to block-commit as well. [Kevin] > --- > block/mirror.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/mirror.c b/block/mirror.c > index f9d1fec..8847ec5 100644 > --- a/block/mirror.c > +++ b/block/mirror.c > @@ -1042,7 +1042,7 @@ void commit_active_start(const char *job_id, > BlockDriverState *bs, > > mirror_start_job(job_id, bs, base, NULL, speed, 0, 0, > MIRROR_LEAVE_BACKING_CHAIN, > - on_error, on_error, false, cb, opaque, _err, > + on_error, on_error, true, cb, opaque, _err, > _active_job_driver, false, base, auto_complete); -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-block] [PULL 11/18] block: Remove qemu_root_bds_opts
The remaining options in qemu_root_bds_opts (aio and copy-on-read) aren't used any more, the QAPI schema doesn't contain them. Therefore all the code processing qemu_root_bds_opts options is dead and can be removed. Signed-off-by: Kevin WolfReviewed-by: Eric Blake Reviewed-by: Max Reitz --- blockdev.c | 54 +- 1 file changed, 1 insertion(+), 53 deletions(-) diff --git a/blockdev.c b/blockdev.c index e2ace04..07ec733 100644 --- a/blockdev.c +++ b/blockdev.c @@ -633,34 +633,11 @@ err_no_opts: return NULL; } -static QemuOptsList qemu_root_bds_opts; - /* Takes the ownership of bs_opts */ static BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp) { -BlockDriverState *bs; -QemuOpts *opts; -Error *local_error = NULL; int bdrv_flags = 0; -opts = qemu_opts_create(_root_bds_opts, NULL, 1, errp); -if (!opts) { -goto fail; -} - -qemu_opts_absorb_qdict(opts, bs_opts, _error); -if (local_error) { -error_propagate(errp, local_error); -goto fail; -} - -extract_common_blockdev_options(opts, _flags, NULL, NULL, -NULL, _error); -if (local_error) { -error_propagate(errp, local_error); -goto fail; -} - /* bdrv_open() defaults to the values in bdrv_flags (for compatibility * with other callers) rather than what we want as the real defaults. * Apply the defaults here instead. */ @@ -672,19 +649,7 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp) bdrv_flags |= BDRV_O_INACTIVE; } -bs = bdrv_open(NULL, NULL, bs_opts, bdrv_flags, errp); -if (!bs) { -goto fail_no_bs_opts; -} - -fail_no_bs_opts: -qemu_opts_del(opts); -return bs; - -fail: -qemu_opts_del(opts); -QDECREF(bs_opts); -return NULL; +return bdrv_open(NULL, NULL, bs_opts, bdrv_flags, errp); } void blockdev_close_all_bdrv_states(void) @@ -4099,23 +4064,6 @@ QemuOptsList qemu_common_drive_opts = { }, }; -static QemuOptsList qemu_root_bds_opts = { -.name = "root-bds", -.head = QTAILQ_HEAD_INITIALIZER(qemu_root_bds_opts.head), -.desc = { -{ -.name = "aio", -.type = QEMU_OPT_STRING, -.help = "host AIO implementation (threads, native)", -},{ -.name = "copy-on-read", -.type = QEMU_OPT_BOOL, -.help = "copy read data from backing file into image file", -}, -{ /* end of list */ } -}, -}; - QemuOptsList qemu_drive_opts = { .name = "drive", .head = QTAILQ_HEAD_INITIALIZER(qemu_drive_opts.head), -- 1.8.3.1
[Qemu-block] [PULL 08/18] block: Parse 'detect-zeroes' in bdrv_open_common()
Amongst others, this means that you can now use the 'detect-zeroes' option for non-top-level nodes in blockdev-add, like the QAPI schema promises. Signed-off-by: Kevin WolfReviewed-by: Eric Blake Reviewed-by: Max Reitz --- block.c| 33 + blockdev.c | 9 + 2 files changed, 34 insertions(+), 8 deletions(-) diff --git a/block.c b/block.c index 493ecf3..1f10457 100644 --- a/block.c +++ b/block.c @@ -42,6 +42,7 @@ #include "qapi-event.h" #include "qemu/cutils.h" #include "qemu/id.h" +#include "qapi/util.h" #ifdef CONFIG_BSD #include @@ -954,6 +955,11 @@ static QemuOptsList bdrv_runtime_opts = { .type = QEMU_OPT_BOOL, .help = "Node is opened in read-only mode", }, +{ +.name = "detect-zeroes", +.type = QEMU_OPT_STRING, +.help = "try to optimize zero writes (off, on, unmap)", +}, { /* end of list */ } }, }; @@ -970,6 +976,7 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file, const char *filename; const char *driver_name = NULL; const char *node_name = NULL; +const char *detect_zeroes; QemuOpts *opts; BlockDriver *drv; Error *local_err = NULL; @@ -1038,6 +1045,32 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file, } } +detect_zeroes = qemu_opt_get(opts, "detect-zeroes"); +if (detect_zeroes) { +BlockdevDetectZeroesOptions value = +qapi_enum_parse(BlockdevDetectZeroesOptions_lookup, +detect_zeroes, +BLOCKDEV_DETECT_ZEROES_OPTIONS__MAX, +BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF, +_err); +if (local_err) { +error_propagate(errp, local_err); +ret = -EINVAL; +goto fail_opts; +} + +if (value == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP && +!(bs->open_flags & BDRV_O_UNMAP)) +{ +error_setg(errp, "setting detect-zeroes to unmap is not allowed " + "without setting discard operation to unmap"); +ret = -EINVAL; +goto fail_opts; +} + +bs->detect_zeroes = value; +} + if (filename != NULL) { pstrcpy(bs->filename, sizeof(bs->filename), filename); } else { diff --git a/blockdev.c b/blockdev.c index 7820f42..511260c 100644 --- a/blockdev.c +++ b/blockdev.c @@ -658,7 +658,6 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp) BlockDriverState *bs; QemuOpts *opts; Error *local_error = NULL; -BlockdevDetectZeroesOptions detect_zeroes; int bdrv_flags = 0; opts = qemu_opts_create(_root_bds_opts, NULL, 1, errp); @@ -673,7 +672,7 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp) } extract_common_blockdev_options(opts, _flags, NULL, NULL, -_zeroes, _error); +NULL, _error); if (local_error) { error_propagate(errp, local_error); goto fail; @@ -695,8 +694,6 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp) goto fail_no_bs_opts; } -bs->detect_zeroes = detect_zeroes; - fail_no_bs_opts: qemu_opts_del(opts); return bs; @@ -4136,10 +4133,6 @@ static QemuOptsList qemu_root_bds_opts = { .name = "copy-on-read", .type = QEMU_OPT_BOOL, .help = "copy read data from backing file into image file", -},{ -.name = "detect-zeroes", -.type = QEMU_OPT_STRING, -.help = "try to optimize zero writes (off, on, unmap)", }, { /* end of list */ } }, -- 1.8.3.1
[Qemu-block] [PULL 14/18] coroutine: add a macro for the coroutine stack size
From: Peter LievenSigned-off-by: Peter Lieven Reviewed-by: Paolo Bonzini Reviewed-by: Richard Henderson Signed-off-by: Kevin Wolf --- include/qemu/coroutine_int.h | 2 ++ util/coroutine-sigaltstack.c | 2 +- util/coroutine-ucontext.c| 2 +- util/coroutine-win32.c | 2 +- 4 files changed, 5 insertions(+), 3 deletions(-) diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h index 6df9d33..14d4f1d 100644 --- a/include/qemu/coroutine_int.h +++ b/include/qemu/coroutine_int.h @@ -28,6 +28,8 @@ #include "qemu/queue.h" #include "qemu/coroutine.h" +#define COROUTINE_STACK_SIZE (1 << 20) + typedef enum { COROUTINE_YIELD = 1, COROUTINE_TERMINATE = 2, diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c index 171cd44..a5bcb7e 100644 --- a/util/coroutine-sigaltstack.c +++ b/util/coroutine-sigaltstack.c @@ -143,7 +143,7 @@ static void coroutine_trampoline(int signal) Coroutine *qemu_coroutine_new(void) { -const size_t stack_size = 1 << 20; +const size_t stack_size = COROUTINE_STACK_SIZE; CoroutineSigAltStack *co; CoroutineThreadState *coTS; struct sigaction sa; diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c index 2bb7e10..31254ab 100644 --- a/util/coroutine-ucontext.c +++ b/util/coroutine-ucontext.c @@ -82,7 +82,7 @@ static void coroutine_trampoline(int i0, int i1) Coroutine *qemu_coroutine_new(void) { -const size_t stack_size = 1 << 20; +const size_t stack_size = COROUTINE_STACK_SIZE; CoroutineUContext *co; ucontext_t old_uc, uc; sigjmp_buf old_env; diff --git a/util/coroutine-win32.c b/util/coroutine-win32.c index 02e28e8..de6bd4f 100644 --- a/util/coroutine-win32.c +++ b/util/coroutine-win32.c @@ -71,7 +71,7 @@ static void CALLBACK coroutine_trampoline(void *co_) Coroutine *qemu_coroutine_new(void) { -const size_t stack_size = 1 << 20; +const size_t stack_size = COROUTINE_STACK_SIZE; CoroutineWin32 *co; co = g_malloc0(sizeof(*co)); -- 1.8.3.1
[Qemu-block] [PULL 02/18] qemu: use bdrv_flush_all for vm_stop et al
From: John SnowReimplement bdrv_flush_all for vm_stop. In contrast to blk_flush_all, bdrv_flush_all does not have device model restrictions. This allows us to flush and halt unconditionally without error. This allows us to do things like migrate when we have a device with an open tray, but has a node that may need to be flushed, or nodes that aren't currently attached to any device and need to be flushed. Specifically, this allows us to migrate when we have a CDROM with an open tray. Signed-off-by: John Snow Reviewed-by: Kevin Wolf Reviewed-by: Max Reitz Acked-by: Fam Zheng Signed-off-by: Kevin Wolf --- cpus.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/cpus.c b/cpus.c index e39ccb7..96d9352 100644 --- a/cpus.c +++ b/cpus.c @@ -751,7 +751,7 @@ static int do_vm_stop(RunState state) } bdrv_drain_all(); -ret = blk_flush_all(); +ret = bdrv_flush_all(); return ret; } @@ -1494,7 +1494,7 @@ int vm_stop_force_state(RunState state) bdrv_drain_all(); /* Make sure to return an error if the flush in a previous vm_stop() * failed. */ -return blk_flush_all(); +return bdrv_flush_all(); } } -- 1.8.3.1
[Qemu-block] [PULL 12/18] oslib-posix: add helpers for stack alloc and free
From: Peter Lieventhe allocated stack will be adjusted to the minimum supported stack size by the OS and rounded up to be a multiple of the system pagesize. Additionally an architecture dependent guard page is added to the stack to catch stack overflows. Signed-off-by: Peter Lieven Signed-off-by: Kevin Wolf --- include/sysemu/os-posix.h | 27 +++ util/oslib-posix.c| 42 ++ 2 files changed, 69 insertions(+) diff --git a/include/sysemu/os-posix.h b/include/sysemu/os-posix.h index 9c7dfdf..3cfedbc 100644 --- a/include/sysemu/os-posix.h +++ b/include/sysemu/os-posix.h @@ -60,4 +60,31 @@ int qemu_utimens(const char *path, const qemu_timespec *times); bool is_daemonized(void); +/** + * qemu_alloc_stack: + * @sz: pointer to a size_t holding the requested usable stack size + * + * Allocate memory that can be used as a stack, for instance for + * coroutines. If the memory cannot be allocated, this function + * will abort (like g_malloc()). This function also inserts an + * additional guard page to catch a potential stack overflow. + * Note that the memory required for the guard page and alignment + * and minimal stack size restrictions will increase the value of sz. + * + * The allocated stack must be freed with qemu_free_stack(). + * + * Returns: pointer to (the lowest address of) the stack memory. + */ +void *qemu_alloc_stack(size_t *sz); + +/** + * qemu_free_stack: + * @stack: stack to free + * @sz: size of stack in bytes + * + * Free a stack allocated via qemu_alloc_stack(). Note that sz must + * be exactly the adjusted stack size returned by qemu_alloc_stack. + */ +void qemu_free_stack(void *stack, size_t sz); + #endif diff --git a/util/oslib-posix.c b/util/oslib-posix.c index f2d4e9e..d950c34 100644 --- a/util/oslib-posix.c +++ b/util/oslib-posix.c @@ -499,3 +499,45 @@ pid_t qemu_fork(Error **errp) } return pid; } + +void *qemu_alloc_stack(size_t *sz) +{ +void *ptr, *guardpage; +size_t pagesz = getpagesize(); +#ifdef _SC_THREAD_STACK_MIN +/* avoid stacks smaller than _SC_THREAD_STACK_MIN */ +long min_stack_sz = sysconf(_SC_THREAD_STACK_MIN); +*sz = MAX(MAX(min_stack_sz, 0), *sz); +#endif +/* adjust stack size to a multiple of the page size */ +*sz = ROUND_UP(*sz, pagesz); +/* allocate one extra page for the guard page */ +*sz += pagesz; + +ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); +if (ptr == MAP_FAILED) { +abort(); +} + +#if defined(HOST_IA64) +/* separate register stack */ +guardpage = ptr + (((*sz - pagesz) / 2) & ~pagesz); +#elif defined(HOST_HPPA) +/* stack grows up */ +guardpage = ptr + *sz - pagesz; +#else +/* stack grows down */ +guardpage = ptr; +#endif +if (mprotect(guardpage, pagesz, PROT_NONE) != 0) { +abort(); +} + +return ptr; +} + +void qemu_free_stack(void *stack, size_t sz) +{ +munmap(stack, sz); +} -- 1.8.3.1
[Qemu-block] [PATCH v14 19/19] qapi: delete unused OptsVisitor code
Now that all code has been converted to QObjectInputVisitor's QemuOpts compatibility mode, there is no longer any reason to keep OptsVisitor alive. Signed-off-by: Daniel P. Berrange--- include/qapi/opts-visitor.h | 40 qapi/Makefile.objs | 2 +- qapi/opts-visitor.c | 544 tests/Makefile.include | 5 +- tests/test-opts-visitor.c | 268 -- vl.c| 1 - 6 files changed, 2 insertions(+), 858 deletions(-) delete mode 100644 include/qapi/opts-visitor.h delete mode 100644 qapi/opts-visitor.c delete mode 100644 tests/test-opts-visitor.c diff --git a/include/qapi/opts-visitor.h b/include/qapi/opts-visitor.h deleted file mode 100644 index 6462c96..000 --- a/include/qapi/opts-visitor.h +++ /dev/null @@ -1,40 +0,0 @@ -/* - * Options Visitor - * - * Copyright Red Hat, Inc. 2012 - * - * Author: Laszlo Ersek - * - * This work is licensed under the terms of the GNU LGPL, version 2.1 or later. - * See the COPYING.LIB file in the top-level directory. - * - */ - -#ifndef OPTS_VISITOR_H -#define OPTS_VISITOR_H - -#include "qapi/visitor.h" -#include "qemu/option.h" - -/* Inclusive upper bound on the size of any flattened range. This is a safety - * (= anti-annoyance) measure; wrong ranges should not cause long startup - * delays nor exhaust virtual memory. - */ -#define OPTS_VISITOR_RANGE_MAX 65536 - -typedef struct OptsVisitor OptsVisitor; - -/* Contrarily to qemu-option.c::parse_option_number(), OptsVisitor's "int" - * parser relies on strtoll() instead of strtoull(). Consequences: - * - string representations of negative numbers yield negative values, - * - values below INT64_MIN or LLONG_MIN are rejected, - * - values above INT64_MAX or LLONG_MAX are rejected. - * - * The Opts input visitor does not implement support for visiting QAPI - * alternates, numbers (other than integers), null, or arbitrary - * QTypes. It also requires a non-null list argument to - * visit_start_list(). - */ -Visitor *opts_visitor_new(const QemuOpts *opts); - -#endif diff --git a/qapi/Makefile.objs b/qapi/Makefile.objs index 33906ff..4b820a7 100644 --- a/qapi/Makefile.objs +++ b/qapi/Makefile.objs @@ -1,6 +1,6 @@ util-obj-y = qapi-visit-core.o qapi-dealloc-visitor.o qobject-input-visitor.o util-obj-y += qobject-output-visitor.o qmp-registry.o qmp-dispatch.o util-obj-y += string-input-visitor.o string-output-visitor.o -util-obj-y += opts-visitor.o qapi-clone-visitor.o +util-obj-y += qapi-clone-visitor.o util-obj-y += qmp-event.o util-obj-y += qapi-util.o diff --git a/qapi/opts-visitor.c b/qapi/opts-visitor.c deleted file mode 100644 index 084f7cc..000 --- a/qapi/opts-visitor.c +++ /dev/null @@ -1,544 +0,0 @@ -/* - * Options Visitor - * - * Copyright Red Hat, Inc. 2012-2016 - * - * Author: Laszlo Ersek - * - * This work is licensed under the terms of the GNU LGPL, version 2.1 or later. - * See the COPYING.LIB file in the top-level directory. - * - */ - -#include "qemu/osdep.h" -#include "qapi/error.h" -#include "qemu/cutils.h" -#include "qapi/qmp/qerror.h" -#include "qapi/opts-visitor.h" -#include "qemu/queue.h" -#include "qemu/option_int.h" -#include "qapi/visitor-impl.h" - - -enum ListMode -{ -LM_NONE, /* not traversing a list of repeated options */ - -LM_IN_PROGRESS, /* opts_next_list() ready to be called. - * - * Generating the next list link will consume the most - * recently parsed QemuOpt instance of the repeated - * option. - * - * Parsing a value into the list link will examine the - * next QemuOpt instance of the repeated option, and - * possibly enter LM_SIGNED_INTERVAL or - * LM_UNSIGNED_INTERVAL. - */ - -LM_SIGNED_INTERVAL, /* opts_next_list() has been called. - * - * Generating the next list link will consume the most - * recently stored element from the signed interval, - * parsed from the most recent QemuOpt instance of the - * repeated option. This may consume QemuOpt itself - * and return to LM_IN_PROGRESS. - * - * Parsing a value into the list link will store the - * next element of the signed interval. - */ - -LM_UNSIGNED_INTERVAL /* Same as above, only for an unsigned interval. */ -}; - -typedef enum ListMode ListMode; - -struct OptsVisitor -{ -Visitor visitor; - -/* Ownership remains with opts_visitor_new()'s caller. */ -const QemuOpts *opts_root; - -unsigned depth; - -/* Non-null
[Qemu-block] [PATCH v14 13/19] qom: support non-scalar properties with -object
The current -object command line syntax only allows for creation of objects with scalar properties, or a list with a fixed scalar element type. Objects which have properties that are represented as structs in the QAPI schema cannot be created using -object. This is a design limitation of the way the OptsVisitor is written. It simply iterates over the QemuOpts values as a flat list. The support for lists is enabled by allowing the same key to be repeated in the opts string. The QObjectInputVisitor now has functionality that allows it to replace OptsVisitor, maintaining full backwards compatibility for previous CLI syntax, while also allowing use of new syntax for structs. Thus -object can now support non-scalar properties, for example the QMP object: { "execute": "object-add", "arguments": { "qom-type": "demo", "id": "demo0", "parameters": { "foo": [ { "bar": "one", "wizz": "1" }, { "bar": "two", "wizz": "2" } ] } } } Would be creatable via the CLI now using $QEMU \ -object demo,id=demo0,\ foo.0.bar=one,foo.0.wizz=1,\ foo.1.bar=two,foo.1.wizz=2 Notice that this syntax is intentionally compatible with that currently used by block drivers. NB the indentation seen here after line continuations should not be used in reality, it is just for clarity in this commit message. Signed-off-by: Daniel P. Berrange--- qapi/qobject-input-visitor.c | 2 +- qom/object_interfaces.c | 38 - tests/check-qom-proplist.c | 367 ++- 3 files changed, 397 insertions(+), 10 deletions(-) diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c index 0aef20e..a353d67 100644 --- a/qapi/qobject-input-visitor.c +++ b/qapi/qobject-input-visitor.c @@ -204,7 +204,7 @@ static void qobject_input_start_struct(Visitor *v, const char *name, void **obj, *obj = NULL; } -if (!qobj && (qiv->struct_level < qiv->autocreate_struct_levels)) { +if (!qobj && (qiv->struct_level <= qiv->autocreate_struct_levels)) { /* Create a new dict that contains all the currently * unvisited items */ if (tos) { diff --git a/qom/object_interfaces.c b/qom/object_interfaces.c index f7afe49..faea1b1 100644 --- a/qom/object_interfaces.c +++ b/qom/object_interfaces.c @@ -4,7 +4,8 @@ #include "qemu/module.h" #include "qapi-visit.h" #include "qapi/qobject-output-visitor.h" -#include "qapi/opts-visitor.h" +#include "qapi/qobject-input-visitor.h" +#include "qemu/option.h" void user_creatable_complete(Object *obj, Error **errp) { @@ -63,12 +64,16 @@ Object *user_creatable_add(const QDict *qdict, if (local_err) { goto out_visit; } -visit_check_struct(v, _err); + +obj = user_creatable_add_type(type, id, pdict, v, _err); if (local_err) { goto out_visit; } -obj = user_creatable_add_type(type, id, pdict, v, _err); +visit_check_struct(v, _err); +if (local_err) { +goto out_visit; +} out_visit: visit_end_struct(v, NULL); @@ -114,7 +119,7 @@ Object *user_creatable_add_type(const char *type, const char *id, assert(qdict); obj = object_new(type); -visit_start_struct(v, NULL, NULL, 0, _err); +visit_start_struct(v, "props", NULL, 0, _err); if (local_err) { goto out; } @@ -158,14 +163,31 @@ Object *user_creatable_add_opts(QemuOpts *opts, Error **errp) { Visitor *v; QDict *pdict; +QObject *pobj; Object *obj = NULL; -v = opts_visitor_new(opts); pdict = qemu_opts_to_qdict(opts, NULL, - QEMU_OPTS_REPEAT_POLICY_LAST); - -obj = user_creatable_add(pdict, v, errp); + QEMU_OPTS_REPEAT_POLICY_LIST); + +pobj = qdict_crumple(pdict, true, errp); +if (!pobj) { +goto cleanup; +} +/* + * We need autocreate_list=true + permit_int_ranges=true + * in order to maintain compat with OptsVisitor creation + * of the 'numa' object which had an int16List property. + * + * We need autocreate_structs=1, because at the CLI level + * we have the object type + properties in the same flat + * struct, even though at the QMP level they are nested. + */ +v = qobject_input_visitor_new_autocast(pobj, true, 1, true); + +obj = user_creatable_add(qobject_to_qdict(pobj), v, errp); visit_free(v); +qobject_decref(pobj); + cleanup: QDECREF(pdict); return obj; } diff --git a/tests/check-qom-proplist.c b/tests/check-qom-proplist.c index a16cefc..20eb1d1 100644 --- a/tests/check-qom-proplist.c +++ b/tests/check-qom-proplist.c @@ -22,7 +22,16 @@ #include "qapi/error.h" #include "qom/object.h" +#include "qom/object_interfaces.h" #include "qemu/module.h" +#include "qapi/visitor.h" +#include "qom/object_interfaces.h" +#include "qemu/option.h" +#include "qemu/config-file.h"
[Qemu-block] [PATCH v14 15/19] numa: convert to use QObjectInputVisitor for -numa
Switch away from using OptsVisitor to parse the -numa argument processing. This enables use of the modern list syntax for specifying CPUs. e.g. the old syntax -numa node,nodeid=0,cpus=0-3,cpus=8-11,mem=107 is equivalent to -numa node,nodeid=0,cpus.0=0,cpus.1=1,cpus.2=2,cpus.3=3,\ cpus.4=8,cpus.5=9,cpus.6=10,cpus.7=11,mem=107 Furthermore, the cli arg can now follow the QAPI schema nesting, so the above is equivalent to the canonical syntax: -numa type=node,data.nodeid=0,data.cpus.0=0,data.cpus.1=1,\ data.cpus.2=2,data.cpus.3=3,data.cpus.4=8,data.cpus.5=9,\ data.cpus.6=10,data.cpus.7=11,data.mem=107 A test case is added to cover argument parsing to validate that both the old and new syntax is correctly handled. Signed-off-by: Daniel P. Berrange--- include/sysemu/numa_int.h | 11 + numa.c| 36 +- stubs/Makefile.objs | 5 ++ stubs/exec.c | 6 +++ stubs/hostmem.c | 14 ++ stubs/memory.c| 41 stubs/qdev.c | 8 stubs/vl.c| 8 stubs/vmstate.c | 4 ++ tests/Makefile.include| 2 + tests/test-numa.c | 116 ++ 11 files changed, 240 insertions(+), 11 deletions(-) create mode 100644 include/sysemu/numa_int.h create mode 100644 stubs/exec.c create mode 100644 stubs/hostmem.c create mode 100644 stubs/memory.c create mode 100644 stubs/qdev.c create mode 100644 stubs/vl.c create mode 100644 tests/test-numa.c diff --git a/include/sysemu/numa_int.h b/include/sysemu/numa_int.h new file mode 100644 index 000..93160da --- /dev/null +++ b/include/sysemu/numa_int.h @@ -0,0 +1,11 @@ +#ifndef SYSEMU_NUMA_INT_H +#define SYSEMU_NUMA_INT_H + +#include "sysemu/numa.h" + +extern int have_memdevs; +extern int max_numa_nodeid; + +int parse_numa(void *opaque, QemuOpts *opts, Error **errp); + +#endif diff --git a/numa.c b/numa.c index 6289f46..5daa3d1 100644 --- a/numa.c +++ b/numa.c @@ -23,14 +23,14 @@ */ #include "qemu/osdep.h" -#include "sysemu/numa.h" +#include "sysemu/numa_int.h" #include "exec/cpu-common.h" #include "qemu/bitmap.h" #include "qom/cpu.h" #include "qemu/error-report.h" #include "include/exec/cpu-common.h" /* for RAM_ADDR_FMT */ #include "qapi-visit.h" -#include "qapi/opts-visitor.h" +#include "qapi/qobject-input-visitor.h" #include "hw/boards.h" #include "sysemu/hostmem.h" #include "qmp-commands.h" @@ -45,10 +45,10 @@ QemuOptsList qemu_numa_opts = { .desc = { { 0 } } /* validated with OptsVisitor */ }; -static int have_memdevs = -1; -static int max_numa_nodeid; /* Highest specified NUMA node ID, plus one. - * For all nodes, nodeid < max_numa_nodeid - */ +int have_memdevs = -1; +int max_numa_nodeid; /* Highest specified NUMA node ID, plus one. + * For all nodes, nodeid < max_numa_nodeid + */ int nb_numa_nodes; NodeInfo numa_info[MAX_NODES]; @@ -189,6 +189,9 @@ static void numa_node_parse(NumaNodeOptions *node, QemuOpts *opts, Error **errp) if (node->has_mem) { uint64_t mem_size = node->mem; const char *mem_str = qemu_opt_get(opts, "mem"); +if (!mem_str) { +mem_str = qemu_opt_get(opts, "data.mem"); +} /* Fix up legacy suffix-less format */ if (g_ascii_isdigit(mem_str[strlen(mem_str) - 1])) { mem_size <<= 20; @@ -211,16 +214,27 @@ static void numa_node_parse(NumaNodeOptions *node, QemuOpts *opts, Error **errp) max_numa_nodeid = MAX(max_numa_nodeid, nodenr + 1); } -static int parse_numa(void *opaque, QemuOpts *opts, Error **errp) +int parse_numa(void *opaque, QemuOpts *opts, Error **errp) { NumaOptions *object = NULL; Error *err = NULL; +Visitor *v; -{ -Visitor *v = opts_visitor_new(opts); -visit_type_NumaOptions(v, NULL, , ); -visit_free(v); +/* + * Needs autocreate_list=true and permit_int_ranges=true + * in order to support existing syntax for 'cpus' parameter + * which is an int list. + * + * Needs autocreate_struct_levels=1, because existing syntax + * used a nested struct in the QMP schema with a flat namespace + * in the CLI args. + */ +v = qobject_input_visitor_new_opts(opts, true, 1, true, ); +if (err) { +goto end; } +visit_type_NumaOptions(v, NULL, , ); +visit_free(v); if (err) { goto end; diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index c5850e8..661b48a 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -48,3 +48,8 @@ stub-obj-y += iohandler.o stub-obj-y += smbios_type_38.o stub-obj-y += ipmi.o stub-obj-y += pc_madt_cpu_entry.o +stub-obj-y += vl.o +stub-obj-y += exec.o +stub-obj-y += memory.o +stub-obj-y += hostmem.o +stub-obj-y += qdev.o diff --git a/stubs/exec.c
[Qemu-block] [PATCH v14 17/19] acpi: convert to QObjectInputVisitor for -acpi parsing
The -acpi command line option parsing uses the OptsVisitor currently. This is easily replaced by the QObjectInputVisitor instead. There is no need to enable any of the compatibility options, since the AcpiTableOptions QAPI struct only contains scalar properties. Signed-off-by: Daniel P. Berrange--- hw/acpi/core.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/hw/acpi/core.c b/hw/acpi/core.c index e890a5d..480d3dd 100644 --- a/hw/acpi/core.c +++ b/hw/acpi/core.c @@ -25,7 +25,7 @@ #include "hw/acpi/acpi.h" #include "hw/nvram/fw_cfg.h" #include "qemu/config-file.h" -#include "qapi/opts-visitor.h" +#include "qapi/qobject-input-visitor.h" #include "qapi-visit.h" #include "qapi-event.h" @@ -237,14 +237,14 @@ void acpi_table_add(const QemuOpts *opts, Error **errp) char **cur; size_t bloblen = 0; char unsigned *blob = NULL; +Visitor *v; -{ -Visitor *v; - -v = opts_visitor_new(opts); -visit_type_AcpiTableOptions(v, NULL, , ); -visit_free(v); +v = qobject_input_visitor_new_opts(opts, false, 0, false, ); +if (err) { +goto out; } +visit_type_AcpiTableOptions(v, NULL, , ); +visit_free(v); if (err) { goto out; -- 2.7.4
[Qemu-block] [PATCH v14 14/19] hmp: support non-scalar properties with object_add
The current object_add HMP syntax only allows for creation of objects with scalar properties, or a list with a fixed scalar element type. Objects which have properties that are represented as structs in the QAPI schema cannot be created using -object. This is a design limitation of the way the OptsVisitor is written. It simply iterates over the QemuOpts values as a flat list. The support for lists is enabled by allowing the same key to be repeated in the opts string. The QObjectInputVisitor now has functionality that allows it to replace OptsVisitor, maintaining full backwards compatibility for previous CLI syntax, while also allowing use of new syntax for structs. Thus -object can now support non-scalar properties, for example the QMP object: { "execute": "object-add", "arguments": { "qom-type": "demo", "id": "demo0", "parameters": { "foo": [ { "bar": "one", "wizz": "1" }, { "bar": "two", "wizz": "2" } ] } } } Would be creatable via the HMP now using object_add demo,id=demo0,\ foo.0.bar=one,foo.0.wizz=1,\ foo.1.bar=two,foo.1.wizz=2 Notice that this syntax is intentionally compatible with that currently used by block drivers. NB the indentation seen here after line continuations should not be used in reality, it is just for clarity in this commit message. Signed-off-by: Daniel P. Berrange--- hmp.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/hmp.c b/hmp.c index ad33b44..da77d4c 100644 --- a/hmp.c +++ b/hmp.c @@ -25,7 +25,7 @@ #include "qemu/sockets.h" #include "monitor/monitor.h" #include "monitor/qdev.h" -#include "qapi/opts-visitor.h" +#include "qapi/qobject-input-visitor.h" #include "qapi/qmp/qerror.h" #include "qapi/string-output-visitor.h" #include "qapi/util.h" @@ -1695,21 +1695,30 @@ void hmp_netdev_del(Monitor *mon, const QDict *qdict) void hmp_object_add(Monitor *mon, const QDict *qdict) { Error *err = NULL; -QemuOpts *opts; Visitor *v; Object *obj = NULL; +QObject *pobj; -opts = qemu_opts_from_qdict(qemu_find_opts("object"), qdict, ); +pobj = qdict_crumple(qdict, true, ); if (err) { -hmp_handle_error(mon, ); -return; +goto cleanup; } -v = opts_visitor_new(opts); -obj = user_creatable_add(qdict, v, ); +/* + * We need autocreate_list=true + permit_int_ranges=true + * in order to maintain compat with OptsVisitor creation + * of the 'numa' object which had an int16List property. + * + * We need autocreate_structs=1, because at the CLI level + * we have the object type + properties in the same flat + * struct, even though at the QMP level they are nested. + */ +v = qobject_input_visitor_new_autocast(pobj, true, 1, true); +obj = user_creatable_add(qobject_to_qdict(pobj), v, ); visit_free(v); -qemu_opts_del(opts); + cleanup: +qobject_decref(pobj); if (err) { hmp_handle_error(mon, ); } -- 2.7.4
[Qemu-block] [PATCH v14 12/19] qapi: allow QObjectInputVisitor to be created with QemuOpts
Instead of requiring all callers to go through the mutli-step process of turning QemuOpts into a suitable QObject for visiting, add a new constructor that encapsulates this logic. This will allow QObjectInputVisitor to be a drop-in replacement for the existing OptsVisitor with minimal code changes for callers. Signed-off-by: Daniel P. Berrange--- include/qapi/qobject-input-visitor.h | 19 +++ include/qemu/option.h| 2 +- qapi/qobject-input-visitor.c | 29 + util/qemu-option.c | 2 +- 4 files changed, 50 insertions(+), 2 deletions(-) diff --git a/include/qapi/qobject-input-visitor.h b/include/qapi/qobject-input-visitor.h index 63f3782..242b767 100644 --- a/include/qapi/qobject-input-visitor.h +++ b/include/qapi/qobject-input-visitor.h @@ -102,4 +102,23 @@ Visitor *qobject_input_visitor_new_autocast(QObject *obj, size_t autocreate_struct_levels, bool permit_int_ranges); + +/** + * Create a new input visitor that converts @opts to a QAPI object. + * + * The QemuOpts will be converted into a QObject using the + * qdict_crumple() method to automatically create structs + * and lists. The resulting QDict will then be passed to the + * qobject_input_visitor_new_autocast() method. See the docs + * of that method for further details on processing behaviour. + * + * The returned input visitor should be released by calling + * visit_free() when no longer required. + */ +Visitor *qobject_input_visitor_new_opts(const QemuOpts *opts, +bool autocreate_list, +size_t autocreate_struct_levels, +bool permit_int_ranges, +Error **errp); + #endif diff --git a/include/qemu/option.h b/include/qemu/option.h index 328c468..bf1f078 100644 --- a/include/qemu/option.h +++ b/include/qemu/option.h @@ -130,7 +130,7 @@ typedef enum { QEMU_OPTS_REPEAT_POLICY_LIST, } QemuOptsRepeatPolicy; -QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict, +QDict *qemu_opts_to_qdict(const QemuOpts *opts, QDict *qdict, QemuOptsRepeatPolicy repeatPolicy); void qemu_opts_absorb_qdict(QemuOpts *opts, QDict *qdict, Error **errp); diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c index a38e779..0aef20e 100644 --- a/qapi/qobject-input-visitor.c +++ b/qapi/qobject-input-visitor.c @@ -747,3 +747,32 @@ Visitor *qobject_input_visitor_new_autocast(QObject *obj, return >visitor; } + + +Visitor *qobject_input_visitor_new_opts(const QemuOpts *opts, +bool autocreate_list, +size_t autocreate_struct_levels, +bool permit_int_ranges, +Error **errp) +{ +QDict *pdict; +QObject *pobj; +Visitor *v = NULL; + +pdict = qemu_opts_to_qdict(opts, NULL, + QEMU_OPTS_REPEAT_POLICY_LIST); + +pobj = qdict_crumple(pdict, true, errp); +if (!pobj) { +goto cleanup; +} + +v = qobject_input_visitor_new_autocast(pobj, + autocreate_list, + autocreate_struct_levels, + permit_int_ranges); + cleanup: +qobject_decref(pobj); +QDECREF(pdict); +return v; +} diff --git a/util/qemu-option.c b/util/qemu-option.c index ad28d4e..db0fef2 100644 --- a/util/qemu-option.c +++ b/util/qemu-option.c @@ -1058,7 +1058,7 @@ void qemu_opts_absorb_qdict(QemuOpts *opts, QDict *qdict, Error **errp) * TODO We'll want to use types appropriate for opt->desc->type, but * this is enough for now. */ -QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict, +QDict *qemu_opts_to_qdict(const QemuOpts *opts, QDict *qdict, QemuOptsRepeatPolicy repeatPolicy) { QemuOpt *opt; -- 2.7.4
[Qemu-block] [PATCH v14 10/19] qapi: permit auto-creating nested structs
Some of the historical command line opts that had their keys in in a completely flat namespace are now represented by QAPI schemas that use a nested structs. When converting the QemuOpts to QObject, there is no information about compound types available, so the QObject will be completely flat, even after the qdict_crumple() call. So when starting a struct, we may not have a QDict available in the input data, so we auto-create a QDict containing all the currently unvisited input data keys. Not all historical command line opts require this, so the behaviour is opt-in, by specifying how many levels of structs are permitted to be auto-created. Note that this only works if the child struct is the last field to the visited in the parent struct. This is always the case for currently existing legacy command line options. The example is the NetLegacy type which has 3 levels of structs. The modern way to represent this in QemuOpts would be the dot-separated component approach -net vlan=1,id=foo,name=bar,opts.type=tap,\ opts.data.fd=3,opts.data.script=ifup The legacy syntax will just be presenting -net vlan=1,id=foo,name=bar,type=tap,fd=3,script=ifup So we need to auto-create 3 levels of struct when visiting. The implementation here will enable visiting in both the modern and legacy syntax, compared to OptsVisitor which only allows the legacy syntax. Signed-off-by: Daniel P. Berrange--- include/qapi/qobject-input-visitor.h | 21 +- qapi/qobject-input-visitor.c | 55 +-- tests/test-qobject-input-visitor.c | 130 --- 3 files changed, 190 insertions(+), 16 deletions(-) diff --git a/include/qapi/qobject-input-visitor.h b/include/qapi/qobject-input-visitor.h index fb52457..90989f4 100644 --- a/include/qapi/qobject-input-visitor.h +++ b/include/qapi/qobject-input-visitor.h @@ -45,7 +45,7 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool strict); * If @autocreate_list is true, then as an alternative to a normal QList, * list values can be stored as a QString or QDict instead, which will * be interpreted as representing single element lists. This should only - * by used if compatibility is required with the OptsVisitor which allowed + * be used if compatibility is required with the OptsVisitor which allowed * repeated keys, without list indexes, to represent lists. e.g. set this * to true if you have compatibility requirements for * @@ -55,6 +55,22 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool strict); * * -arg foo.0=hello,foo.1=world * + * If @autocreate_struct_levels is non-zero, then when visiting structs, + * if the corresponding QDict is not found, it will automatically create + * a QDict containing all remaining unvisited options. This should only + * be used if compatibility is required with the OptsVisitor which flatten + * structs so that all keys were at the same level. e.g. set this to a + * non-zero number if you compatibility requirements for + * + * -arg type=person,surname=blogs,forename=fred + * + * to be treated as equivalent to the perferred syntax + * + * -arg type=person,data.surname=blogs,data.forename=fred + * + * The value given determines how many levels of structs are allowed to + * be flattened in this way. + * * The visitor always operates in strict mode, requiring all dict keys * to be consumed during visitation. An error will be reported if this * does not happen. @@ -63,6 +79,7 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool strict); * visit_free() when no longer required. */ Visitor *qobject_input_visitor_new_autocast(QObject *obj, -bool autocreate_list); +bool autocreate_list, +size_t autocreate_struct_levels); #endif diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c index e8afd1e..6cacd4b 100644 --- a/qapi/qobject-input-visitor.c +++ b/qapi/qobject-input-visitor.c @@ -52,6 +52,14 @@ struct QObjectInputVisitor /* Whether we can auto-create single element lists when * encountering a non-QList type */ bool autocreate_list; + +/* Current depth of recursion into structs */ +size_t struct_level; + +/* Numbers of levels at which we will + * consider auto-creating a struct containing + * remaining unvisited items */ +size_t autocreate_struct_levels; }; static QObjectInputVisitor *to_qiv(Visitor *v) @@ -79,7 +87,12 @@ static QObject *qobject_input_get_object(QObjectInputVisitor *qiv, if (qobject_type(qobj) == QTYPE_QDICT) { assert(name); -ret = qdict_get(qobject_to_qdict(qobj), name); +if (qiv->autocreate_struct_levels && +!g_hash_table_contains(tos->h, name)) { +ret = NULL; +} else { +ret = qdict_get(qobject_to_qdict(qobj), name); +} if (tos->h
[Qemu-block] [PATCH v14 08/19] qapi: permit scalar type conversions in QObjectInputVisitor
Currently the QObjectInputVisitor assumes that all scalar values are directly represented as the final types declared by the thing being visited. ie it assumes an 'int' is using QInt, and a 'bool' is using QBool, etc. This is good when QObjectInputVisitor is fed a QObject that came from a JSON document on the QMP monitor, as it will strictly validate correctness. To allow QObjectInputVisitor to be reused for visiting a QObject originating from QemuOpts, an alternative mode is needed where all the scalars types are represented as QString and converted on the fly to the final desired type. Reviewed-by: Kevin WolfReviewed-by: Marc-André Lureau Signed-off-by: Daniel P. Berrange --- include/qapi/qobject-input-visitor.h | 32 +- qapi/qobject-input-visitor.c | 132 tests/test-qobject-input-visitor.c | 194 ++- 3 files changed, 350 insertions(+), 8 deletions(-) diff --git a/include/qapi/qobject-input-visitor.h b/include/qapi/qobject-input-visitor.h index cde328d..5022297 100644 --- a/include/qapi/qobject-input-visitor.h +++ b/include/qapi/qobject-input-visitor.h @@ -19,12 +19,36 @@ typedef struct QObjectInputVisitor QObjectInputVisitor; -/* - * Return a new input visitor that converts a QObject to a QAPI object. +/** + * Create a new input visitor that converts @obj to a QAPI object. + * + * Any scalar values in the @obj input data structure should be in the + * required type already. i.e. if visiting a bool, the value should + * already be a QBool instance. * - * Set @strict to reject a parse that doesn't consume all keys of a - * dictionary; otherwise excess input is ignored. + * If @strict is set to true, then an error will be reported if any + * dict keys are not consumed during visitation. If @strict is false + * then extra dict keys are silently ignored. + * + * The returned input visitor should be released by calling + * visit_free() when no longer required. */ Visitor *qobject_input_visitor_new(QObject *obj, bool strict); +/** + * Create a new input visitor that converts @obj to a QAPI object. + * + * Any scalar values in the @obj input data structure should always be + * represented as strings. i.e. if visiting a boolean, the value should + * be a QString whose contents represent a valid boolean. + * + * The visitor always operates in strict mode, requiring all dict keys + * to be consumed during visitation. An error will be reported if this + * does not happen. + * + * The returned input visitor should be released by calling + * visit_free() when no longer required. + */ +Visitor *qobject_input_visitor_new_autocast(QObject *obj); + #endif diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c index 5ff3db3..cf41df6 100644 --- a/qapi/qobject-input-visitor.c +++ b/qapi/qobject-input-visitor.c @@ -20,6 +20,7 @@ #include "qemu-common.h" #include "qapi/qmp/types.h" #include "qapi/qmp/qerror.h" +#include "qemu/cutils.h" #define QIV_STACK_SIZE 1024 @@ -263,6 +264,28 @@ static void qobject_input_type_int64(Visitor *v, const char *name, int64_t *obj, *obj = qint_get_int(qint); } + +static void qobject_input_type_int64_autocast(Visitor *v, const char *name, + int64_t *obj, Error **errp) +{ +QObjectInputVisitor *qiv = to_qiv(v); +QString *qstr = qobject_to_qstring(qobject_input_get_object(qiv, name, +true)); +int64_t ret; + +if (!qstr || !qstr->string) { +error_setg(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null", + "string"); +return; +} + +if (qemu_strtoll(qstr->string, NULL, 0, ) < 0) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name, "a number"); +return; +} +*obj = ret; +} + static void qobject_input_type_uint64(Visitor *v, const char *name, uint64_t *obj, Error **errp) { @@ -279,6 +302,27 @@ static void qobject_input_type_uint64(Visitor *v, const char *name, *obj = qint_get_int(qint); } +static void qobject_input_type_uint64_autocast(Visitor *v, const char *name, + uint64_t *obj, Error **errp) +{ +QObjectInputVisitor *qiv = to_qiv(v); +QString *qstr = qobject_to_qstring(qobject_input_get_object(qiv, name, +true)); +unsigned long long ret; + +if (!qstr || !qstr->string) { +error_setg(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null", + "string"); +return; +} + +if (parse_uint_full(qstr->string, , 0) < 0) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name, "a number"); +return; +} +*obj = ret; +} + static void qobject_input_type_bool(Visitor *v, const char *name, bool
[Qemu-block] [PATCH v14 07/19] qapi: don't pass two copies of TestInputVisitorData to tests
The input_visitor_test_add() method was accepting an instance of 'TestInputVisitorData' and passing it as the 'user_data' parameter to test functions. The main 'TestInputVisitorData' instance that was actually used, was meanwhile being allocated automatically by the test framework fixture setup. Signed-off-by: Daniel P. Berrange--- tests/test-qobject-input-visitor.c | 76 -- 1 file changed, 32 insertions(+), 44 deletions(-) diff --git a/tests/test-qobject-input-visitor.c b/tests/test-qobject-input-visitor.c index 0e65e63..26c5012 100644 --- a/tests/test-qobject-input-visitor.c +++ b/tests/test-qobject-input-visitor.c @@ -747,10 +747,11 @@ static void test_visitor_in_native_list_number(TestInputVisitorData *data, } static void input_visitor_test_add(const char *testpath, - TestInputVisitorData *data, - void (*test_func)(TestInputVisitorData *data, const void *user_data)) + const void *user_data, + void (*test_func)(TestInputVisitorData *data, + const void *user_data)) { -g_test_add(testpath, TestInputVisitorData, data, NULL, test_func, +g_test_add(testpath, TestInputVisitorData, user_data, NULL, test_func, visitor_input_teardown); } @@ -833,77 +834,64 @@ static void test_visitor_in_wrong_type(TestInputVisitorData *data, int main(int argc, char **argv) { -TestInputVisitorData in_visitor_data; - g_test_init(, , NULL); input_visitor_test_add("/visitor/input/int", - _visitor_data, test_visitor_in_int); + NULL, test_visitor_in_int); input_visitor_test_add("/visitor/input/int_overflow", - _visitor_data, test_visitor_in_int_overflow); + NULL, test_visitor_in_int_overflow); input_visitor_test_add("/visitor/input/bool", - _visitor_data, test_visitor_in_bool); + NULL, test_visitor_in_bool); input_visitor_test_add("/visitor/input/number", - _visitor_data, test_visitor_in_number); + NULL, test_visitor_in_number); input_visitor_test_add("/visitor/input/string", - _visitor_data, test_visitor_in_string); + NULL, test_visitor_in_string); input_visitor_test_add("/visitor/input/enum", - _visitor_data, test_visitor_in_enum); + NULL, test_visitor_in_enum); input_visitor_test_add("/visitor/input/struct", - _visitor_data, test_visitor_in_struct); + NULL, test_visitor_in_struct); input_visitor_test_add("/visitor/input/struct-nested", - _visitor_data, test_visitor_in_struct_nested); + NULL, test_visitor_in_struct_nested); input_visitor_test_add("/visitor/input/list", - _visitor_data, test_visitor_in_list); + NULL, test_visitor_in_list); input_visitor_test_add("/visitor/input/any", - _visitor_data, test_visitor_in_any); + NULL, test_visitor_in_any); input_visitor_test_add("/visitor/input/null", - _visitor_data, test_visitor_in_null); + NULL, test_visitor_in_null); input_visitor_test_add("/visitor/input/union-flat", - _visitor_data, test_visitor_in_union_flat); + NULL, test_visitor_in_union_flat); input_visitor_test_add("/visitor/input/alternate", - _visitor_data, test_visitor_in_alternate); + NULL, test_visitor_in_alternate); input_visitor_test_add("/visitor/input/errors", - _visitor_data, test_visitor_in_errors); + NULL, test_visitor_in_errors); input_visitor_test_add("/visitor/input/wrong-type", - _visitor_data, test_visitor_in_wrong_type); + NULL, test_visitor_in_wrong_type); input_visitor_test_add("/visitor/input/alternate-number", - _visitor_data, test_visitor_in_alternate_number); + NULL, test_visitor_in_alternate_number); input_visitor_test_add("/visitor/input/native_list/int", - _visitor_data, - test_visitor_in_native_list_int); + NULL, test_visitor_in_native_list_int); input_visitor_test_add("/visitor/input/native_list/int8", - _visitor_data, - test_visitor_in_native_list_int8); +
[Qemu-block] [PATCH v14 04/19] qapi: add trace events for visitor
Allow tracing of the operation of visitors Signed-off-by: Daniel P. Berrange--- Makefile.objs | 1 + qapi/qapi-visit-core.c | 27 +++ qapi/trace-events | 33 + 3 files changed, 61 insertions(+) create mode 100644 qapi/trace-events diff --git a/Makefile.objs b/Makefile.objs index 7301544..54da068 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -160,3 +160,4 @@ trace-events-y += target-s390x/trace-events trace-events-y += target-ppc/trace-events trace-events-y += qom/trace-events trace-events-y += linux-user/trace-events +trace-events-y += qapi/trace-events diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c index 55f5876..bfcb276 100644 --- a/qapi/qapi-visit-core.c +++ b/qapi/qapi-visit-core.c @@ -19,10 +19,12 @@ #include "qapi/qmp/qerror.h" #include "qapi/visitor.h" #include "qapi/visitor-impl.h" +#include "trace.h" void visit_complete(Visitor *v, void *opaque) { assert(v->type != VISITOR_OUTPUT || v->complete); +trace_visit_complete(v, opaque); if (v->complete) { v->complete(v, opaque); } @@ -30,6 +32,7 @@ void visit_complete(Visitor *v, void *opaque) void visit_free(Visitor *v) { +trace_visit_free(v); if (v) { v->free(v); } @@ -40,6 +43,7 @@ void visit_start_struct(Visitor *v, const char *name, void **obj, { Error *err = NULL; +trace_visit_start_struct(v, name, obj, size); if (obj) { assert(size); assert(!(v->type & VISITOR_OUTPUT) || *obj); @@ -53,6 +57,7 @@ void visit_start_struct(Visitor *v, const char *name, void **obj, void visit_check_struct(Visitor *v, Error **errp) { +trace_visit_check_struct(v); if (v->check_struct) { v->check_struct(v, errp); } @@ -60,6 +65,7 @@ void visit_check_struct(Visitor *v, Error **errp) void visit_end_struct(Visitor *v, void **obj) { +trace_visit_end_struct(v, obj); v->end_struct(v, obj); } @@ -69,6 +75,7 @@ void visit_start_list(Visitor *v, const char *name, GenericList **list, Error *err = NULL; assert(!list || size >= sizeof(GenericList)); +trace_visit_start_list(v, name, list, size); v->start_list(v, name, list, size, ); if (list && (v->type & VISITOR_INPUT)) { assert(!(err && *list)); @@ -79,11 +86,13 @@ void visit_start_list(Visitor *v, const char *name, GenericList **list, GenericList *visit_next_list(Visitor *v, GenericList *tail, size_t size) { assert(tail && size >= sizeof(GenericList)); +trace_visit_next_list(v, tail, size); return v->next_list(v, tail, size); } void visit_end_list(Visitor *v, void **obj) { +trace_visit_end_list(v, obj); v->end_list(v, obj); } @@ -95,6 +104,7 @@ void visit_start_alternate(Visitor *v, const char *name, assert(obj && size >= sizeof(GenericAlternate)); assert(!(v->type & VISITOR_OUTPUT) || *obj); +trace_visit_start_alternate(v, name, obj, size, promote_int); if (v->start_alternate) { v->start_alternate(v, name, obj, size, promote_int, ); } @@ -106,6 +116,7 @@ void visit_start_alternate(Visitor *v, const char *name, void visit_end_alternate(Visitor *v, void **obj) { +trace_visit_end_alternate(v, obj); if (v->end_alternate) { v->end_alternate(v, obj); } @@ -113,6 +124,7 @@ void visit_end_alternate(Visitor *v, void **obj) bool visit_optional(Visitor *v, const char *name, bool *present) { +trace_visit_optional(v, name, present); if (v->optional) { v->optional(v, name, present); } @@ -127,6 +139,7 @@ bool visit_is_input(Visitor *v) void visit_type_int(Visitor *v, const char *name, int64_t *obj, Error **errp) { assert(obj); +trace_visit_type_int(v, name, obj); v->type_int64(v, name, obj, errp); } @@ -151,6 +164,7 @@ void visit_type_uint8(Visitor *v, const char *name, uint8_t *obj, Error **errp) { uint64_t value = *obj; +trace_visit_type_uint8(v, name, obj); visit_type_uintN(v, , name, UINT8_MAX, "uint8_t", errp); *obj = value; } @@ -159,6 +173,7 @@ void visit_type_uint16(Visitor *v, const char *name, uint16_t *obj, Error **errp) { uint64_t value = *obj; +trace_visit_type_uint16(v, name, obj); visit_type_uintN(v, , name, UINT16_MAX, "uint16_t", errp); *obj = value; } @@ -167,6 +182,7 @@ void visit_type_uint32(Visitor *v, const char *name, uint32_t *obj, Error **errp) { uint64_t value = *obj; +trace_visit_type_uint32(v, name, obj); visit_type_uintN(v, , name, UINT32_MAX, "uint32_t", errp); *obj = value; } @@ -175,6 +191,7 @@ void visit_type_uint64(Visitor *v, const char *name, uint64_t *obj, Error **errp) { assert(obj); +trace_visit_type_uint64(v, name, obj); v->type_uint64(v, name, obj, errp); } @@ -199,6 +216,7 @@ static void
[Qemu-block] [PATCH v14 11/19] qapi: add integer range support for QObjectInputVisitor
The traditional CLI arg syntax allows two ways to specify integer lists, either one value per key, or a range of values per key. eg the following are identical: -arg foo=5,foo=6,foo=7 -arg foo=5-7 This extends the QObjectInputVisitor so that it is able to parse ranges and turn them into distinct list entries. This means that -arg foo=5-7 is treated as equivalent to -arg foo.0=5,foo.1=6,foo.2=7 Edge case tests are copied from test-opts-visitor to ensure identical behaviour when parsing. Signed-off-by: Daniel P. Berrange--- include/qapi/qobject-input-visitor.h | 22 +++- qapi/qobject-input-visitor.c | 154 +-- tests/test-qobject-input-visitor.c | 195 +-- 3 files changed, 356 insertions(+), 15 deletions(-) diff --git a/include/qapi/qobject-input-visitor.h b/include/qapi/qobject-input-visitor.h index 90989f4..63f3782 100644 --- a/include/qapi/qobject-input-visitor.h +++ b/include/qapi/qobject-input-visitor.h @@ -19,6 +19,12 @@ typedef struct QObjectInputVisitor QObjectInputVisitor; +/* Inclusive upper bound on the size of any flattened range. This is a safety + * (= anti-annoyance) measure; wrong ranges should not cause long startup + * delays nor exhaust virtual memory. + */ +#define QIV_RANGE_MAX 65536 + /** * Create a new input visitor that converts @obj to a QAPI object. * @@ -71,6 +77,19 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool strict); * The value given determines how many levels of structs are allowed to * be flattened in this way. * + * If @permit_int_ranges is true, then when visiting a list of integers, + * the integer value strings may encode ranges eg a single element + * containing "5-7" is treated as if there were three elements "5", "6", + * "7". This should only be used if compatibility is required with the + * OptsVisitor which would allow integer ranges. e.g. set this to true + * if you have compatibility requirements for + * + * -arg val=5-8 + * + * to be treated as equivalent to the preferred syntax: + * + * -arg val.0=5,val.1=6,val.2=7,val.3=8 + * * The visitor always operates in strict mode, requiring all dict keys * to be consumed during visitation. An error will be reported if this * does not happen. @@ -80,6 +99,7 @@ Visitor *qobject_input_visitor_new(QObject *obj, bool strict); */ Visitor *qobject_input_visitor_new_autocast(QObject *obj, bool autocreate_list, -size_t autocreate_struct_levels); +size_t autocreate_struct_levels, +bool permit_int_ranges); #endif diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c index 6cacd4b..a38e779 100644 --- a/qapi/qobject-input-visitor.c +++ b/qapi/qobject-input-visitor.c @@ -31,6 +31,8 @@ typedef struct StackObject GHashTable *h; /* If obj is dict: unvisited keys */ const QListEntry *entry; /* If obj is list: unvisited tail */ +uint64_t range_val; +uint64_t range_limit; QSLIST_ENTRY(StackObject) node; } StackObject; @@ -60,6 +62,10 @@ struct QObjectInputVisitor * consider auto-creating a struct containing * remaining unvisited items */ size_t autocreate_struct_levels; + +/* Whether int lists can have single values representing + * ranges of values */ +bool permit_int_ranges; }; static QObjectInputVisitor *to_qiv(Visitor *v) @@ -282,7 +288,7 @@ static GenericList *qobject_input_next_list(Visitor *v, GenericList *tail, QObjectInputVisitor *qiv = to_qiv(v); StackObject *so = QSLIST_FIRST(>stack); -if (!so->entry) { +if ((so->range_val == so->range_limit) && !so->entry) { return NULL; } tail->next = g_malloc0(size); @@ -329,21 +335,87 @@ static void qobject_input_type_int64_autocast(Visitor *v, const char *name, int64_t *obj, Error **errp) { QObjectInputVisitor *qiv = to_qiv(v); -QString *qstr = qobject_to_qstring(qobject_input_get_object(qiv, name, -true)); +QString *qstr; int64_t ret; +const char *end = NULL; +StackObject *tos; +bool inlist = false; + +/* Preferentially generate values from a range, before + * trying to consume another QList element */ +tos = QSLIST_FIRST(>stack); +if (tos) { +if ((int64_t)tos->range_val < (int64_t)tos->range_limit) { +*obj = tos->range_val + 1; +tos->range_val++; +return; +} else { +inlist = tos->entry != NULL; +} +} +qstr = qobject_to_qstring(qobject_input_get_object(qiv, name, + true)); if (!qstr || !qstr->string) { error_setg(errp,
[Qemu-block] [PATCH v14 06/19] qapi: rename QmpOutputVisitor to QObjectOutputVisitor
The QmpOutputVisitor has no direct dependency on QMP. It is valid to use it anywhere that one wants a QObject. Rename it to better reflect its functionality as a generic QAPI to QObject converter. Reviewed-by: Kevin WolfReviewed-by: Eric Blake Signed-off-by: Daniel P. Berrange --- block/qapi.c | 4 +- blockdev.c | 4 +- docs/qapi-code-gen.txt | 2 +- ...p-output-visitor.h => qobject-output-visitor.h} | 10 +- qapi/Makefile.objs | 2 +- qapi/qapi-clone-visitor.c | 2 +- qapi/qmp-output-visitor.c | 256 - qapi/qobject-output-visitor.c | 254 qemu-img.c | 8 +- qom/object_interfaces.c| 2 +- qom/qom-qobject.c | 4 +- scripts/qapi-commands.py | 4 +- scripts/qapi-event.py | 4 +- tests/.gitignore | 2 +- tests/Makefile.include | 8 +- tests/check-qnull.c| 4 +- ...put-visitor.c => test-qobject-output-visitor.c} | 6 +- tests/test-string-output-visitor.c | 2 +- tests/test-visitor-serialization.c | 4 +- util/qemu-sockets.c| 2 +- 20 files changed, 291 insertions(+), 293 deletions(-) rename include/qapi/{qmp-output-visitor.h => qobject-output-visitor.h} (66%) delete mode 100644 qapi/qmp-output-visitor.c create mode 100644 qapi/qobject-output-visitor.c rename tests/{test-qmp-output-visitor.c => test-qobject-output-visitor.c} (99%) diff --git a/block/qapi.c b/block/qapi.c index 6f947e3..f35c89f 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -29,7 +29,7 @@ #include "block/write-threshold.h" #include "qmp-commands.h" #include "qapi-visit.h" -#include "qapi/qmp-output-visitor.h" +#include "qapi/qobject-output-visitor.h" #include "qapi/qmp/types.h" #include "sysemu/block-backend.h" #include "qemu/cutils.h" @@ -691,7 +691,7 @@ void bdrv_image_info_specific_dump(fprintf_function func_fprintf, void *f, ImageInfoSpecific *info_spec) { QObject *obj, *data; -Visitor *v = qmp_output_visitor_new(); +Visitor *v = qobject_output_visitor_new(); visit_type_ImageInfoSpecific(v, NULL, _spec, _abort); visit_complete(v, ); diff --git a/blockdev.c b/blockdev.c index 5ef3193..3f5d528 100644 --- a/blockdev.c +++ b/blockdev.c @@ -43,7 +43,7 @@ #include "qapi/qmp/types.h" #include "qapi-visit.h" #include "qapi/qmp/qerror.h" -#include "qapi/qmp-output-visitor.h" +#include "qapi/qobject-output-visitor.h" #include "qapi/util.h" #include "sysemu/sysemu.h" #include "block/block_int.h" @@ -3784,7 +3784,7 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp) BlockDriverState *bs; BlockBackend *blk = NULL; QObject *obj; -Visitor *v = qmp_output_visitor_new(); +Visitor *v = qobject_output_visitor_new(); QDict *qdict; Error *local_err = NULL; diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt index d2604b6..2841c51 100644 --- a/docs/qapi-code-gen.txt +++ b/docs/qapi-code-gen.txt @@ -1005,7 +1005,7 @@ Example: Error *err = NULL; Visitor *v; -v = qmp_output_visitor_new(ret_out); +v = qobject_output_visitor_new(ret_out); visit_type_UserDefOne(v, "unused", _in, ); if (!err) { visit_complete(v, ret_out); diff --git a/include/qapi/qmp-output-visitor.h b/include/qapi/qobject-output-visitor.h similarity index 66% rename from include/qapi/qmp-output-visitor.h rename to include/qapi/qobject-output-visitor.h index 040fdda..358c959 100644 --- a/include/qapi/qmp-output-visitor.h +++ b/include/qapi/qobject-output-visitor.h @@ -11,20 +11,20 @@ * */ -#ifndef QMP_OUTPUT_VISITOR_H -#define QMP_OUTPUT_VISITOR_H +#ifndef QOBJECT_OUTPUT_VISITOR_H +#define QOBJECT_OUTPUT_VISITOR_H #include "qapi/visitor.h" #include "qapi/qmp/qobject.h" -typedef struct QmpOutputVisitor QmpOutputVisitor; +typedef struct QObjectOutputVisitor QObjectOutputVisitor; /* - * Create a new QMP output visitor. + * Create a new QOBJECT output visitor. * * If everything else succeeds, pass @result to visit_complete() to * collect the result of the visit. */ -Visitor *qmp_output_visitor_new(QObject **result); +Visitor *qobject_output_visitor_new(QObject **result); #endif diff --git a/qapi/Makefile.objs b/qapi/Makefile.objs index 6ec7bdc..33906ff 100644 --- a/qapi/Makefile.objs +++ b/qapi/Makefile.objs @@ -1,5 +1,5 @@ util-obj-y = qapi-visit-core.o qapi-dealloc-visitor.o qobject-input-visitor.o -util-obj-y +=
[Qemu-block] [PATCH v14 05/19] qapi: rename QmpInputVisitor to QObjectInputVisitor
The QmpInputVisitor has no direct dependency on QMP. It is valid to use it anywhere that one has a QObject. Rename it to better reflect its functionality as a generic QObject to QAPI converter. Reviewed-by: Kevin WolfReviewed-by: Eric Blake Signed-off-by: Daniel P. Berrange --- docs/qapi-code-gen.txt | 2 +- ...qmp-input-visitor.h => qobject-input-visitor.h} | 10 +- include/qapi/visitor.h | 6 +- monitor.c | 2 +- qapi/Makefile.objs | 2 +- ...qmp-input-visitor.c => qobject-input-visitor.c} | 171 +++-- qmp.c | 4 +- qom/qom-qobject.c | 4 +- scripts/qapi-commands.py | 4 +- target-s390x/cpu_models.c | 4 +- tests/.gitignore | 4 +- tests/Makefile.include | 12 +- tests/check-qnull.c| 4 +- tests/test-qmp-commands.c | 4 +- ...-input-strict.c => test-qobject-input-strict.c} | 6 +- ...nput-visitor.c => test-qobject-input-visitor.c} | 6 +- tests/test-string-input-visitor.c | 2 +- tests/test-visitor-serialization.c | 4 +- util/qemu-sockets.c| 2 +- 19 files changed, 128 insertions(+), 125 deletions(-) rename include/qapi/{qmp-input-visitor.h => qobject-input-visitor.h} (63%) rename qapi/{qmp-input-visitor.c => qobject-input-visitor.c} (56%) rename tests/{test-qmp-input-strict.c => test-qobject-input-strict.c} (98%) rename tests/{test-qmp-input-visitor.c => test-qobject-input-visitor.c} (99%) diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt index 5d4c2cd..d2604b6 100644 --- a/docs/qapi-code-gen.txt +++ b/docs/qapi-code-gen.txt @@ -1024,7 +1024,7 @@ Example: Visitor *v; UserDefOneList *arg1 = NULL; -v = qmp_input_visitor_new(QOBJECT(args), true); +v = qobject_input_visitor_new(QOBJECT(args), true); visit_start_struct(v, NULL, NULL, 0, ); if (err) { goto out; diff --git a/include/qapi/qmp-input-visitor.h b/include/qapi/qobject-input-visitor.h similarity index 63% rename from include/qapi/qmp-input-visitor.h rename to include/qapi/qobject-input-visitor.h index f3ff5f3..cde328d 100644 --- a/include/qapi/qmp-input-visitor.h +++ b/include/qapi/qobject-input-visitor.h @@ -11,20 +11,20 @@ * */ -#ifndef QMP_INPUT_VISITOR_H -#define QMP_INPUT_VISITOR_H +#ifndef QOBJECT_INPUT_VISITOR_H +#define QOBJECT_INPUT_VISITOR_H #include "qapi/visitor.h" #include "qapi/qmp/qobject.h" -typedef struct QmpInputVisitor QmpInputVisitor; +typedef struct QObjectInputVisitor QObjectInputVisitor; /* - * Return a new input visitor that converts QMP to QAPI. + * Return a new input visitor that converts a QObject to a QAPI object. * * Set @strict to reject a parse that doesn't consume all keys of a * dictionary; otherwise excess input is ignored. */ -Visitor *qmp_input_visitor_new(QObject *obj, bool strict); +Visitor *qobject_input_visitor_new(QObject *obj, bool strict); #endif diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h index 6c77a91..9bb6cba 100644 --- a/include/qapi/visitor.h +++ b/include/qapi/visitor.h @@ -25,14 +25,14 @@ * for doing work at each node of a QAPI graph; it can also be used * for a virtual walk, where there is no actual QAPI C struct. * - * There are four kinds of visitor classes: input visitors (QMP, + * There are four kinds of visitor classes: input visitors (QObject, * string, and QemuOpts) parse an external representation and build - * the corresponding QAPI graph, output visitors (QMP and string) take + * the corresponding QAPI graph, output visitors (QObject and string) take * a completed QAPI graph and generate an external representation, the * dealloc visitor can take a QAPI graph (possibly partially * constructed) and recursively free its resources, and the clone * visitor performs a deep clone of one QAPI object to another. While - * the dealloc and QMP input/output visitors are general, the string, + * the dealloc and QObject input/output visitors are general, the string, * QemuOpts, and clone visitors have some implementation limitations; * see the documentation for each visitor for more details on what it * supports. Also, see visitor-impl.h for the callback contracts diff --git a/monitor.c b/monitor.c index 7dcd66b..792b995 100644 --- a/monitor.c +++ b/monitor.c @@ -957,7 +957,7 @@ EventInfoList *qmp_query_events(Error **errp) * directly into QObject instead of first parsing it with * visit_type_SchemaInfoList() into a SchemaInfoList, then marshal it * to QObject with generated output marshallers, every
[Qemu-block] [PATCH v14 01/19] qdict: implement a qdict_crumple method for un-flattening a dict
The qdict_flatten() method will take a dict whose elements are further nested dicts/lists and flatten them by concatenating keys. The qdict_crumple() method aims to do the reverse, taking a flat qdict, and turning it into a set of nested dicts/lists. It will apply nesting based on the key name, with a '.' indicating a new level in the hierarchy. If the keys in the nested structure are all numeric, it will create a list, otherwise it will create a dict. If the keys are a mixture of numeric and non-numeric, or the numeric keys are not in strictly ascending order, an error will be reported. As an example, a flat dict containing { 'foo.0.bar': 'one', 'foo.0.wizz': '1', 'foo.1.bar': 'two', 'foo.1.wizz': '2' } will get turned into a dict with one element 'foo' whose value is a list. The list elements will each in turn be dicts. { 'foo': [ { 'bar': 'one', 'wizz': '1' }, { 'bar': 'two', 'wizz': '2' } ], } If the key is intended to contain a literal '.', then it must be escaped as '..'. ie a flat dict { 'foo..bar': 'wizz', 'bar.foo..bar': 'eek', 'bar.hello': 'world' } Will end up as { 'foo.bar': 'wizz', 'bar': { 'foo.bar': 'eek', 'hello': 'world' } } The intent of this function is that it allows a set of QemuOpts to be turned into a nested data structure that mirrors the nesting used when the same object is defined over QMP. Reviewed-by: Kevin WolfReviewed-by: Marc-André Lureau Signed-off-by: Daniel P. Berrange --- include/qapi/qmp/qdict.h | 1 + qobject/qdict.c | 291 +++ tests/check-qdict.c | 260 ++ 3 files changed, 552 insertions(+) diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h index 71b8eb0..e0d24e1 100644 --- a/include/qapi/qmp/qdict.h +++ b/include/qapi/qmp/qdict.h @@ -73,6 +73,7 @@ void qdict_flatten(QDict *qdict); void qdict_extract_subqdict(QDict *src, QDict **dst, const char *start); void qdict_array_split(QDict *src, QList **dst); int qdict_array_entries(QDict *src, const char *subqdict); +QObject *qdict_crumple(const QDict *src, bool recursive, Error **errp); void qdict_join(QDict *dest, QDict *src, bool overwrite); diff --git a/qobject/qdict.c b/qobject/qdict.c index 60f158c..beef273 100644 --- a/qobject/qdict.c +++ b/qobject/qdict.c @@ -17,6 +17,7 @@ #include "qapi/qmp/qbool.h" #include "qapi/qmp/qstring.h" #include "qapi/qmp/qobject.h" +#include "qapi/error.h" #include "qemu/queue.h" #include "qemu-common.h" #include "qemu/cutils.h" @@ -683,6 +684,296 @@ void qdict_array_split(QDict *src, QList **dst) } } + +/** + * qdict_split_flat_key: + * @key: the key string to split + * @prefix: non-NULL pointer to hold extracted prefix + * @suffix: non-NULL pointer to remaining suffix + * + * Given a flattened key such as 'foo.0.bar', split it into two parts + * at the first '.' separator. Allows double dot ('..') to escape the + * normal separator. + * + * eg + *'foo.0.bar' -> prefix='foo' and suffix='0.bar' + *'foo..0.bar' -> prefix='foo.0' and suffix='bar' + * + * The '..' sequence will be unescaped in the returned 'prefix' + * string. The 'suffix' string will be left in escaped format, so it + * can be fed back into the qdict_split_flat_key() key as the input + * later. + * + * The caller is responsible for freeing the string returned in @prefix + * using g_free(). + */ +static void qdict_split_flat_key(const char *key, char **prefix, + const char **suffix) +{ +const char *separator; +size_t i, j; + +/* Find first '.' separator, but if there is a pair '..' + * that acts as an escape, so skip over '..' */ +separator = NULL; +do { +if (separator) { +separator += 2; +} else { +separator = key; +} +separator = strchr(separator, '.'); +} while (separator && separator[1] == '.'); + +if (separator) { +*prefix = g_strndup(key, separator - key); +*suffix = separator + 1; +} else { +*prefix = g_strdup(key); +*suffix = NULL; +} + +/* Unescape the '..' sequence into '.' */ +for (i = 0, j = 0; (*prefix)[i] != '\0'; i++, j++) { +if ((*prefix)[i] == '.') { +assert((*prefix)[i + 1] == '.'); +i++; +} +(*prefix)[j] = (*prefix)[i]; +} +(*prefix)[j] = '\0'; +} + + +/** + * qdict_is_list: + * @maybe_list: dict to check if keys represent list elements. + * + * Determine whether all keys in @maybe_list are valid list elements. + * If @maybe_list is non-zero in length and all the keys look like + * valid list indexes, this will return 1. If @maybe_list is zero + * length or all keys are non-numeric then it will return 0 to indicate + * it is a normal qdict. If there is a mix of numeric and
[Qemu-block] [PATCH v14 02/19] option: make parse_option_bool/number non-static
The opts-visitor.c opts_type_bool() method has code for parsing a string to set a bool value, as does the qemu-option.c parse_option_bool() method, except it handles fewer cases. To enable consistency across the codebase, extend parse_option_bool() to handle "yes", "no", "y" and "n", and make it non-static. Convert the opts visitor to call this method directly. Also make parse_option_number() non-static to allow for similar reuse later. Reviewed-by: Kevin WolfReviewed-by: Marc-André Lureau Reviewed-by: Eric Blake Reviewed-by: Markus Armbruster Signed-off-by: Daniel P. Berrange --- include/qemu/option.h | 4 qapi/opts-visitor.c | 19 +-- tests/qemu-iotests/051.out| 6 +++--- tests/qemu-iotests/051.pc.out | 6 +++--- tests/qemu-iotests/137.out| 4 ++-- util/qemu-option.c| 27 --- 6 files changed, 29 insertions(+), 37 deletions(-) diff --git a/include/qemu/option.h b/include/qemu/option.h index 1f9e3f9..2a5266f 100644 --- a/include/qemu/option.h +++ b/include/qemu/option.h @@ -37,6 +37,10 @@ int get_param_value(char *buf, int buf_size, const char *tag, const char *str); +void parse_option_bool(const char *name, const char *value, bool *ret, + Error **errp); +void parse_option_number(const char *name, const char *value, + uint64_t *ret, Error **errp); void parse_option_size(const char *name, const char *value, uint64_t *ret, Error **errp); bool has_help_option(const char *param); diff --git a/qapi/opts-visitor.c b/qapi/opts-visitor.c index 1048bbc..084f7cc 100644 --- a/qapi/opts-visitor.c +++ b/qapi/opts-visitor.c @@ -334,7 +334,6 @@ opts_type_str(Visitor *v, const char *name, char **obj, Error **errp) } -/* mimics qemu-option.c::parse_option_bool() */ static void opts_type_bool(Visitor *v, const char *name, bool *obj, Error **errp) { @@ -346,23 +345,7 @@ opts_type_bool(Visitor *v, const char *name, bool *obj, Error **errp) return; } -if (opt->str) { -if (strcmp(opt->str, "on") == 0 || -strcmp(opt->str, "yes") == 0 || -strcmp(opt->str, "y") == 0) { -*obj = true; -} else if (strcmp(opt->str, "off") == 0 || -strcmp(opt->str, "no") == 0 || -strcmp(opt->str, "n") == 0) { -*obj = false; -} else { -error_setg(errp, QERR_INVALID_PARAMETER_VALUE, opt->name, - "on|yes|y|off|no|n"); -return; -} -} else { -*obj = true; -} +parse_option_bool(opt->name, opt->str, obj, errp); processed(ov, name); } diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out index 408d613..ffdc0b9 100644 --- a/tests/qemu-iotests/051.out +++ b/tests/qemu-iotests/051.out @@ -86,13 +86,13 @@ QEMU X.Y.Z monitor - type 'help' for more information (qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts= -QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=: Parameter 'lazy-refcounts' expects 'on' or 'off' +QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=: Parameter 'lazy-refcounts' expects 'on', 'yes', 'y', 'off', 'no' or 'n' Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42 -QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42: Parameter 'lazy-refcounts' expects 'on' or 'off' +QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42: Parameter 'lazy-refcounts' expects 'on', 'yes', 'y', 'off', 'no' or 'n' Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=foo -QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=foo: Parameter 'lazy-refcounts' expects 'on' or 'off' +QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=foo: Parameter 'lazy-refcounts' expects 'on', 'yes', 'y', 'off', 'no' or 'n' === With version 2 images enabling lazy refcounts must fail === diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out index ec6d222..6abf1ab 100644 --- a/tests/qemu-iotests/051.pc.out +++ b/tests/qemu-iotests/051.pc.out @@ -86,13 +86,13 @@ QEMU X.Y.Z monitor - type 'help' for more information (qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts= -QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=: Parameter 'lazy-refcounts' expects 'on' or 'off' +QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=: Parameter 'lazy-refcounts' expects 'on', 'yes', 'y', 'off', 'no' or 'n' Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42 -QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=42: Parameter
[Qemu-block] [PATCH v14 03/19] option: allow qemu_opts_to_qdict to merge repeated options
If given an option string such as size=1024,nodes=10,nodes=4-5,nodes=1-2,policy=bind the qemu_opts_to_qdict() method will currently overwrite the values for repeated option keys, so only the last value is in the returned dict: size=1024 nodes=1-2 policy=bind This adds the ability for the caller to ask that the repeated keys be turned into list indexes: size=1024 nodes.0=10 nodes.1=4-5 nodes.2=1-2 policy=bind Note that the conversion has no way of knowing whether any given key is expected to be a list upfront - it can only figure that out when seeing the first duplicated key. Thus the caller has to be prepared to deal with the fact that if a key 'foo' is a list, then the returned qdict may contain either 'foo' (if only a single instance of the key was seen) or 'foo.NN' (if multiple instances of the key were seen). Signed-off-by: Daniel P. Berrange--- blockdev.c | 7 --- include/qemu/option.h| 8 +++- monitor.c| 3 ++- qemu-img.c | 4 +++- qemu-io-cmds.c | 3 ++- qemu-io.c| 6 -- qemu-nbd.c | 3 ++- qom/object_interfaces.c | 3 ++- tests/test-qemu-opts.c | 40 tests/test-replication.c | 9 ++--- util/qemu-option.c | 41 ++--- 11 files changed, 110 insertions(+), 17 deletions(-) diff --git a/blockdev.c b/blockdev.c index 3010393..5ef3193 100644 --- a/blockdev.c +++ b/blockdev.c @@ -911,7 +911,8 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type) /* Get a QDict for processing the options */ bs_opts = qdict_new(); -qemu_opts_to_qdict(all_opts, bs_opts); +qemu_opts_to_qdict(all_opts, bs_opts, + QEMU_OPTS_REPEAT_POLICY_LAST); legacy_opts = qemu_opts_create(_legacy_drive_opts, NULL, 0, _abort); @@ -3758,8 +3759,8 @@ void hmp_drive_add_node(Monitor *mon, const char *optstr) return; } -qdict = qemu_opts_to_qdict(opts, NULL); - +qdict = qemu_opts_to_qdict(opts, NULL, + QEMU_OPTS_REPEAT_POLICY_LAST); if (!qdict_get_try_str(qdict, "node-name")) { QDECREF(qdict); error_report("'node-name' needs to be specified"); diff --git a/include/qemu/option.h b/include/qemu/option.h index 2a5266f..328c468 100644 --- a/include/qemu/option.h +++ b/include/qemu/option.h @@ -125,7 +125,13 @@ void qemu_opts_set_defaults(QemuOptsList *list, const char *params, int permit_abbrev); QemuOpts *qemu_opts_from_qdict(QemuOptsList *list, const QDict *qdict, Error **errp); -QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict); +typedef enum { +QEMU_OPTS_REPEAT_POLICY_LAST, +QEMU_OPTS_REPEAT_POLICY_LIST, +} QemuOptsRepeatPolicy; + +QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict, + QemuOptsRepeatPolicy repeatPolicy); void qemu_opts_absorb_qdict(QemuOpts *opts, QDict *qdict, Error **errp); typedef int (*qemu_opts_loopfunc)(void *opaque, QemuOpts *opts, Error **errp); diff --git a/monitor.c b/monitor.c index 83c4edf..7dcd66b 100644 --- a/monitor.c +++ b/monitor.c @@ -2642,7 +2642,8 @@ static QDict *monitor_parse_arguments(Monitor *mon, if (!opts) { goto fail; } -qemu_opts_to_qdict(opts, qdict); +qemu_opts_to_qdict(opts, qdict, + QEMU_OPTS_REPEAT_POLICY_LAST); qemu_opts_del(opts); } break; diff --git a/qemu-img.c b/qemu-img.c index ceffefe..b399ae5 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -273,7 +273,9 @@ static BlockBackend *img_open_opts(const char *optstr, QDict *options; Error *local_err = NULL; BlockBackend *blk; -options = qemu_opts_to_qdict(opts, NULL); +options = qemu_opts_to_qdict(opts, NULL, + QEMU_OPTS_REPEAT_POLICY_LAST); + blk = blk_new_open(NULL, NULL, options, flags, _err); if (!blk) { error_reportf_err(local_err, "Could not open '%s': ", optstr); diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c index 3a3838a..e14beed 100644 --- a/qemu-io-cmds.c +++ b/qemu-io-cmds.c @@ -1952,7 +1952,8 @@ static int reopen_f(BlockBackend *blk, int argc, char **argv) } qopts = qemu_opts_find(_opts, NULL); -opts = qopts ? qemu_opts_to_qdict(qopts, NULL) : NULL; +opts = qopts ? qemu_opts_to_qdict(qopts, NULL, + QEMU_OPTS_REPEAT_POLICY_LAST) : NULL; qemu_opts_reset(_opts); brq = bdrv_reopen_queue(NULL, bs, opts, flags); diff --git a/qemu-io.c b/qemu-io.c index db129ea..b295afa 100644 --- a/qemu-io.c +++ b/qemu-io.c @@ -207,7 +207,8 @@ static int open_f(BlockBackend *blk, int argc, char **argv)
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
- Original Message - > From: "Denis V. Lunev"> To: "Paolo Bonzini" > Cc: "Vladimir Sementsov-Ogievskiy" , > qemu-de...@nongnu.org, qemu-block@nongnu.org, > nbd-gene...@lists.sourceforge.net, a...@alex.org.uk, ebl...@redhat.com, > kw...@redhat.com, stefa...@redhat.com, > w...@uter.be > Sent: Tuesday, September 27, 2016 12:25:54 PM > Subject: Re: [PATCH] proto: add 'shift' extension. > > On 09/27/2016 01:15 PM, Paolo Bonzini wrote: > >> We could go in a different direction and export flag > >> 'has_zero_init' which will report that the storage is > >> initialized with all zeroes at the moment. In this > >> case mirroring code will not fall into this > >> branch. > > Why don't you add the zero_init flag to QEMU's NBD driver instead? > > for all cases without knowing real backend on the server side? > I think this would be wrong. Add it to the command line, and leave it to libvirt or the user to pass "-drive file.driver=nbd,...,file.zero-init=on". Paolo
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
On 09/26/2016 03:46 PM, Vladimir Sementsov-Ogievskiy wrote: > This extension allows big requests for TRIM and WRITE_ZEROES through > special 'shift' parameter, which means that offset and length should be > shifted left by several bits. > > This is needed for efficient clearing large regions of the disk (up to > the whole disk). > > Signed-off-by: Vladimir Sementsov-Ogievskiy> --- > > Here mentioned WRITE_ZEROES command which is only an experemental > extension for now. > > To dicuss: > NBD_OPT_SHIFT Data. It can be reduced to 8 bits actually... Are some >reserved bits needed here? > > doc/proto.md | 19 ++- > 1 file changed, 18 insertions(+), 1 deletion(-) > > diff --git a/doc/proto.md b/doc/proto.md > index 2de3a6a..6fd1b16 100644 > --- a/doc/proto.md > +++ b/doc/proto.md > @@ -682,6 +682,8 @@ The field has the following format: >experimental `WRITE_ZEROES` > [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md). > - bit 7, `NBD_FLAG_SEND_DF`: defined by the experimental `STRUCTURED_REPLY` > > [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md). > +- bit 8, `NBD_FLAG_SEND_SHIFT` : exposes support for `NBD_CMD_FLAG_SHIFT` and > + `NBD_OPT_SHIFT` > > Clients SHOULD ignore unknown flags. > > @@ -765,6 +767,15 @@ of the newstyle negotiation. > > Defined by the experimental `INFO` > [extension](https://github.com/yoe/nbd/blob/extension-info/doc/proto.md). > > +- `NBD_OPT_SHIFT` (10) > + > +Defines shift used to calculate request offset and length if > +`NBD_CMD_FLAG_SHIFT` is set. > + > +Data: > + > +- 32 bits, shift (unsigned); Must not be larger than 32. > + > Option reply types > > These values are used in the "reply type" field, sent by the server > @@ -872,7 +883,13 @@ valid may depend on negotiation during the handshake > phase. > > [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md). > - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY` > > [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md). > - > +- bit 3, `NBD_CMD_FLAG_SHIFT`; This flag is valid for `NBD_CMD_TRIM` and > + `NBD_CMD_WRITE_ZEROES`. If this flag is set the server shifts request > + *length* and *offset* left by N bits, where N is defined by `NBD_OPT_SHIFT` > + option or is assumed to be 16 bits by default if `NBD_OPT_SHIFT` option is > + not specified. If after shift `(offset + length)` exceeds disk size, length > + should be reduced to `( - offset)`. However, `(offset + length)` > + must not exceed disk size by more than `(1 << N) - 1`. > > Request types > We could go in a different direction and export flag 'has_zero_init' which will report that the storage is initialized with all zeroes at the moment. In this case mirroring code will not fall into this branch. Den
Re: [Qemu-block] [Qemu-devel] [PATCH v2] block: Turn on "unmap" in active commit
On Tue, 09/27 13:15, Kevin Wolf wrote: > Am 27.09.2016 um 13:04 hat Fam Zheng geschrieben: > > On Tue, 09/27 17:33, Fam Zheng wrote: > > > We already specified BDRV_O_UNMAP when opening images in 'qemu-img > > > commit', but didn't turn on the "unmap" in the active commit job. This > > > patch fixes that so that zeroed clusters in top image can be discarded > > > which is desired in the virt-sparsify use case, where a temporary > > > overlay is created and fstrim'ed before commiting back, to free space in > > > the original image. > > > > > > This also enables it for block-commit. > > > > > > Signed-off-by: Fam Zheng> > > --- > > > v2: Add "unmap" to block-commit as well. [Kevin] > > > --- > > > block/mirror.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/block/mirror.c b/block/mirror.c > > > index f9d1fec..8f6f506 100644 > > > --- a/block/mirror.c > > > +++ b/block/mirror.c > > > @@ -1043,7 +1043,7 @@ void commit_active_start(const char *job_id, > > > BlockDriverState *bs, > > > mirror_start_job(job_id, bs, base, NULL, speed, 0, 0, > > > MIRROR_LEAVE_BACKING_CHAIN, > > > on_error, on_error, false, cb, opaque, _err, > > > - _active_job_driver, false, base, > > > auto_complete); > > > + _active_job_driver, true, base, > > > auto_complete); > > > > I'm an idiot! I changed the wrong parameter (is_none_mode). > > Actually, the idiotic part is having a function with eighteen > parameters. Not too surprising that someone confuses them sooner or > later... > > Perhaps we could enable QAPI boxing for blockdev-mirror and then just > pass down a struct. Then we could have designated initialisers here in > commit_active_start() and it would be obvious if the wrong one is > changed. Good, one more thing in my TODO list. Thanks. Fam > > Not something to do in this series, of course, just a general idea for > improvement. > > Kevin
Re: [Qemu-block] [Qemu-devel] [PATCH v2] block: Turn on "unmap" in active commit
Am 27.09.2016 um 13:04 hat Fam Zheng geschrieben: > On Tue, 09/27 17:33, Fam Zheng wrote: > > We already specified BDRV_O_UNMAP when opening images in 'qemu-img > > commit', but didn't turn on the "unmap" in the active commit job. This > > patch fixes that so that zeroed clusters in top image can be discarded > > which is desired in the virt-sparsify use case, where a temporary > > overlay is created and fstrim'ed before commiting back, to free space in > > the original image. > > > > This also enables it for block-commit. > > > > Signed-off-by: Fam Zheng> > --- > > v2: Add "unmap" to block-commit as well. [Kevin] > > --- > > block/mirror.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/block/mirror.c b/block/mirror.c > > index f9d1fec..8f6f506 100644 > > --- a/block/mirror.c > > +++ b/block/mirror.c > > @@ -1043,7 +1043,7 @@ void commit_active_start(const char *job_id, > > BlockDriverState *bs, > > mirror_start_job(job_id, bs, base, NULL, speed, 0, 0, > > MIRROR_LEAVE_BACKING_CHAIN, > > on_error, on_error, false, cb, opaque, _err, > > - _active_job_driver, false, base, > > auto_complete); > > + _active_job_driver, true, base, auto_complete); > > I'm an idiot! I changed the wrong parameter (is_none_mode). Actually, the idiotic part is having a function with eighteen parameters. Not too surprising that someone confuses them sooner or later... Perhaps we could enable QAPI boxing for blockdev-mirror and then just pass down a struct. Then we could have designated initialisers here in commit_active_start() and it would be obvious if the wrong one is changed. Not something to do in this series, of course, just a general idea for improvement. Kevin
[Qemu-block] [PATCH v3] block: Turn on "unmap" in active commit
We already specified BDRV_O_UNMAP when opening images in 'qemu-img commit', but didn't turn on the "unmap" in the active commit job. This patch fixes that so that zeroed clusters in top image can be discarded which is desired in the virt-sparsify use case, where a temporary overlay is created and fstrim'ed before commiting back, to free space in the original image. This also enables it for block-commit. Signed-off-by: Fam Zheng--- v3: Change the right parameter. v2: Add "unmap" to block-commit as well. [Kevin] --- block/mirror.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/mirror.c b/block/mirror.c index f9d1fec..8847ec5 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1042,7 +1042,7 @@ void commit_active_start(const char *job_id, BlockDriverState *bs, mirror_start_job(job_id, bs, base, NULL, speed, 0, 0, MIRROR_LEAVE_BACKING_CHAIN, - on_error, on_error, false, cb, opaque, _err, + on_error, on_error, true, cb, opaque, _err, _active_job_driver, false, base, auto_complete); if (local_err) { error_propagate(errp, local_err); -- 2.7.4
Re: [Qemu-block] [Qemu-devel] [PATCH v2] block: Turn on "unmap" in active commit
On Tue, 09/27 17:33, Fam Zheng wrote: > We already specified BDRV_O_UNMAP when opening images in 'qemu-img > commit', but didn't turn on the "unmap" in the active commit job. This > patch fixes that so that zeroed clusters in top image can be discarded > which is desired in the virt-sparsify use case, where a temporary > overlay is created and fstrim'ed before commiting back, to free space in > the original image. > > This also enables it for block-commit. > > Signed-off-by: Fam Zheng> --- > v2: Add "unmap" to block-commit as well. [Kevin] > --- > block/mirror.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/mirror.c b/block/mirror.c > index f9d1fec..8f6f506 100644 > --- a/block/mirror.c > +++ b/block/mirror.c > @@ -1043,7 +1043,7 @@ void commit_active_start(const char *job_id, > BlockDriverState *bs, > mirror_start_job(job_id, bs, base, NULL, speed, 0, 0, > MIRROR_LEAVE_BACKING_CHAIN, > on_error, on_error, false, cb, opaque, _err, > - _active_job_driver, false, base, auto_complete); > + _active_job_driver, true, base, auto_complete); I'm an idiot! I changed the wrong parameter (is_none_mode). Will send v3. Fam > if (local_err) { > error_propagate(errp, local_err); > goto error_restore_flags; > -- > 2.7.4 > >
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
On 09/27/2016 01:15 PM, Paolo Bonzini wrote: >> We could go in a different direction and export flag >> 'has_zero_init' which will report that the storage is >> initialized with all zeroes at the moment. In this >> case mirroring code will not fall into this >> branch. > Why don't you add the zero_init flag to QEMU's NBD driver instead? > > Thanks, > > Paolo for all cases without knowing real backend on the server side? I think this would be wrong. Den
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
> We could go in a different direction and export flag > 'has_zero_init' which will report that the storage is > initialized with all zeroes at the moment. In this > case mirroring code will not fall into this > branch. Why don't you add the zero_init flag to QEMU's NBD driver instead? Thanks, Paolo
Re: [Qemu-block] [Qemu-devel] [PATCH v2] block: Turn on "unmap" in active commit
On Tue, 09/27 17:33, Fam Zheng wrote: > We already specified BDRV_O_UNMAP when opening images in 'qemu-img > commit', but didn't turn on the "unmap" in the active commit job. This > patch fixes that so that zeroed clusters in top image can be discarded > which is desired in the virt-sparsify use case, where a temporary > overlay is created and fstrim'ed before commiting back, to free space in > the original image. > > This also enables it for block-commit. > > Signed-off-by: Fam ZhengSent too soon. I see a number of iotests failures (020 040 097 129), still investigating. Fam
Re: [Qemu-block] VMDK file unable to open in Qemu but ESXi
Am 27.09.2016 um 10:16 schrieb Kevin Wolf: Am 27.09.2016 um 04:02 hat Fam Zheng geschrieben: On Mon, 09/26 16:30, Peter Lieven wrote: So it looks like this file is not compliant to the specfication. Is it recognized and imported by VMware? Yes, VMware ESXi imports it flawlessly. The only possibility is that the "footer" is not in the end of the image. There must be some other sector begins with bytes '4b 44 4d 56' besides the first. Theoretically we can make QEMU accept this case by "scanning" the whole image to look for the footer, but I'm not sure it's worth it. I would be surprised if VMware were scanning the whole image, so there must be some other way. If such images exist in real life, we should probably try to support them. There is no second magic in the whole file. Is it possible that VMware restores the footer from the info in the descriptor? $ hexdump -C PI-VA-3.1.0.0.132-flex-disk1.vmdk | grep "4b 44 4d 56" 4b 44 4d 56 03 00 00 00 01 00 03 00 00 00 9f 24 |KDMV...$| $ Peter
[Qemu-block] [PATCH v2] block: Turn on "unmap" in active commit
We already specified BDRV_O_UNMAP when opening images in 'qemu-img commit', but didn't turn on the "unmap" in the active commit job. This patch fixes that so that zeroed clusters in top image can be discarded which is desired in the virt-sparsify use case, where a temporary overlay is created and fstrim'ed before commiting back, to free space in the original image. This also enables it for block-commit. Signed-off-by: Fam Zheng--- v2: Add "unmap" to block-commit as well. [Kevin] --- block/mirror.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/mirror.c b/block/mirror.c index f9d1fec..8f6f506 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1043,7 +1043,7 @@ void commit_active_start(const char *job_id, BlockDriverState *bs, mirror_start_job(job_id, bs, base, NULL, speed, 0, 0, MIRROR_LEAVE_BACKING_CHAIN, on_error, on_error, false, cb, opaque, _err, - _active_job_driver, false, base, auto_complete); + _active_job_driver, true, base, auto_complete); if (local_err) { error_propagate(errp, local_err); goto error_restore_flags; -- 2.7.4
Re: [Qemu-block] [PATCH v3 1/3] qemu-nbd: Add --fork option
Am 25.08.2016 um 18:30 hat Max Reitz geschrieben: > Using the --fork option, one can make qemu-nbd fork the worker process. > The original process will exit on error of the worker or once the worker > enters the main loop. > > Suggested-by: Sascha Silbe> Signed-off-by: Max Reitz > --- > qemu-nbd.c | 15 ++- > 1 file changed, 14 insertions(+), 1 deletion(-) > > diff --git a/qemu-nbd.c b/qemu-nbd.c > index e3571c2..8c2d582 100644 > --- a/qemu-nbd.c > +++ b/qemu-nbd.c > @@ -48,6 +48,7 @@ > #define QEMU_NBD_OPT_OBJECT260 > #define QEMU_NBD_OPT_TLSCREDS 261 > #define QEMU_NBD_OPT_IMAGE_OPTS262 > +#define QEMU_NBD_OPT_FORK 263 > > #define MBR_SIZE 512 > > @@ -503,6 +504,7 @@ int main(int argc, char **argv) > { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS }, > { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS }, > { "trace", required_argument, NULL, 'T' }, > +{ "fork", no_argument, NULL, QEMU_NBD_OPT_FORK }, > { NULL, 0, NULL, 0 } > }; > int ch; > @@ -524,6 +526,8 @@ int main(int argc, char **argv) > bool imageOpts = false; > bool writethrough = true; > char *trace_file = NULL; > +bool fork_process = false; > +int old_stderr = -1; > > /* The client thread uses SIGTERM to interrupt the server. A signal > * handler ensures that "qemu-nbd -v -c" exits with a nice status code. > @@ -714,6 +718,9 @@ int main(int argc, char **argv) > g_free(trace_file); > trace_file = trace_opt_parse(optarg); > break; > +case QEMU_NBD_OPT_FORK: > +fork_process = true; > +break; > } > } > > @@ -773,7 +780,7 @@ int main(int argc, char **argv) > return 0; > } > > -if (device && !verbose) { > +if ((device && !verbose) || fork_process) { > int stderr_fd[2]; > pid_t pid; > int ret; > @@ -796,6 +803,7 @@ int main(int argc, char **argv) > ret = qemu_daemon(1, 0); > > /* Temporarily redirect stderr to the parent's pipe... */ > +old_stderr = dup(STDERR_FILENO); > dup2(stderr_fd[1], STDERR_FILENO); > if (ret < 0) { > error_report("Failed to daemonize: %s", strerror(errno)); I don't have a kernel with NBD support, so I can't test this easily, but doesn't this break the daemon mode for --connect? I mean, it will still fork, but won't the parent be alive until the child exits? > @@ -951,6 +959,11 @@ int main(int argc, char **argv) > exit(EXIT_FAILURE); > } > > +if (fork_process) { > +dup2(old_stderr, STDERR_FILENO); > +close(old_stderr); > +} Because this code doesn't run for --connect (unless --fork is given, too). > state = RUNNING; > do { > main_loop_wait(false); The documentation needs an update, too. Kevin
Re: [Qemu-block] [PATCH] qemu-img: Allow unmap backing image for zeroed clusters
On Tue, 09/27 10:41, Kevin Wolf wrote: > Am 27.09.2016 um 04:20 hat Fam Zheng geschrieben: > > We already specified BDRV_O_UNMAP when opening images in 'qemu-img > > commit', but didn't turn on the "unmap" in the active commit job. This > > patch fixes that so that zeroed clusters in top image can be discarded > > which is desired in the virt-sparsify use case, where a temporary > > overlay is created and fstrim'ed before commiting back, to free space in > > the original image. > > > > The block-commit keeps the existing behavior. > > > > Signed-off-by: Fam Zheng> > Is there a good reason for not using discard in an active commit if the > image was opened with BDRV_O_UNMAP? That is, wouldn't it make sense to > just set the unmap option for the mirror unconditionally and also for > block-commit? > > If there is a reason, we should probably add an option to block-commit, > but I still feels that enabling it should be the default then. I didn't touch block-commit because I wasn't sure, but you made a good point, we can turn this on there too. Fam
Re: [Qemu-block] [PATCH] qemu-img: Allow unmap backing image for zeroed clusters
Am 27.09.2016 um 04:20 hat Fam Zheng geschrieben: > We already specified BDRV_O_UNMAP when opening images in 'qemu-img > commit', but didn't turn on the "unmap" in the active commit job. This > patch fixes that so that zeroed clusters in top image can be discarded > which is desired in the virt-sparsify use case, where a temporary > overlay is created and fstrim'ed before commiting back, to free space in > the original image. > > The block-commit keeps the existing behavior. > > Signed-off-by: Fam ZhengIs there a good reason for not using discard in an active commit if the image was opened with BDRV_O_UNMAP? That is, wouldn't it make sense to just set the unmap option for the mirror unconditionally and also for block-commit? If there is a reason, we should probably add an option to block-commit, but I still feels that enabling it should be the default then. Kevin
Re: [Qemu-block] [PATCH v2 0/7] block: Make more blockdev-add options work
Am 23.09.2016 um 16:32 hat Kevin Wolf geschrieben: > This series moves a few more options from flags to the appropriate place. This > doesn't only result in cleaner code, but also allows using these options in > nested node definitions. > > After this series, bds_tree_init() is only a thin wrapper around bdrv_open() > which sets the right defaults for cache modes and the BDRV_O_INACTIVE flag if > necessary. > > Depends on my block branch, specifically Berto's 'read-only' patches. Applied to the block branch. Kevin
Re: [Qemu-block] [PATCH v2 0/7] block: Make more blockdev-add options work
Am 26.09.2016 um 19:10 hat Max Reitz geschrieben: > It's a bit funny now how blockdev_init() and > extract_common_blockdev_options() are actually used by neither > blockdev-add nor -blockdev, but only by drive_new() (which happily > advertises blockdev_init() to be shared with blockev-add, though). Yes. Essentially drive_new() is now the clean wrappers that map legacy syntax to the modern options, whereas blockdev_init() is left as the place for the messy stuff. Maybe we can get rid of it sooner or later. Kevin pgpoL4iN9pEYc.pgp Description: PGP signature
Re: [Qemu-block] [PATCH v2 3/7] block/qapi: Move 'aio' option to file driver
Am 26.09.2016 um 18:49 hat Max Reitz geschrieben: > On 23.09.2016 16:32, Kevin Wolf wrote: > > The option whether or not to use a native AIO interface really isn't a > > generic option for all drivers, but only applies to the native file > > protocols. This patch moves the option in blockdev-add to the > > appropriate places (raw-posix and raw-win32). > > > > We still have to keep the flag BDRV_O_NATIVE_AIO for compatibility > > because so far the AIO option was usually specified on the wrong layer > > (the top-level format driver, which didn't even look at it) and then > > inherited by the protocol driver (where it was actually used). We can't > > forbid this use except in new interfaces. > > > > Signed-off-by: Kevin Wolf> > --- > > block/raw-posix.c | 44 --- > > block/raw-win32.c | 56 > > +- > > qapi/block-core.json | 6 +++--- > > tests/qemu-iotests/087 | 4 ++-- > > 4 files changed, 83 insertions(+), 27 deletions(-) > > [...] > > > diff --git a/block/raw-win32.c b/block/raw-win32.c > > index 56f45fe..734bb10 100644 > > --- a/block/raw-win32.c > > +++ b/block/raw-win32.c > > [...] > > > +static bool get_aio_option(QemuOpts *opts, int flags, Error **errp) > > +{ > > +BlockdevAioOptions aio, aio_default; > > + > > +aio_default = (flags & BDRV_O_NATIVE_AIO) ? BLOCKDEV_AIO_OPTIONS_NATIVE > > + : > > BLOCKDEV_AIO_OPTIONS_THREADS; > > +aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, > > "aio"), > > + BLOCKDEV_AIO_OPTIONS__MAX, aio_default, errp); > > + > > +switch (aio) { > > +case BLOCKDEV_AIO_OPTIONS_NATIVE: > > +return true; > > +case BLOCKDEV_AIO_OPTIONS_THREADS: > > +return false; > > +default: > > +error_setg(errp, "Invalid AIO option"); > > Any reason for catching this case here but not in raw-posix? > > (Not that it really matters, though.) Nobody will forget raw-posix when adding a new AIO mode to win32, so I didn't feel like the additional code was worth it there. But if we add a new AIO mode to raw-posix, I'm pretty sure we will forget win32. Kevin pgpzdSa6z_E2f.pgp Description: PGP signature
Re: [Qemu-block] VMDK file unable to open in Qemu but ESXi
Am 27.09.2016 um 04:02 hat Fam Zheng geschrieben: > On Mon, 09/26 16:30, Peter Lieven wrote: > > > So it looks like this file is not compliant to the specfication. Is it > > > recognized and imported by VMware? > > > > Yes, VMware ESXi imports it flawlessly. > > The only possibility is that the "footer" is not in the end of the image. > There > must be some other sector begins with bytes '4b 44 4d 56' besides the first. > Theoretically we can make QEMU accept this case by "scanning" the whole image > to look for the footer, but I'm not sure it's worth it. I would be surprised if VMware were scanning the whole image, so there must be some other way. If such images exist in real life, we should probably try to support them. Kevin
Re: [Qemu-block] [Nbd] [PATCH] proto: add 'shift' extension.
> On 27 Sep 2016, at 00:41, Wouter Verhelstwrote: > > Thoughts? Honestly? This whole thing seems like complication for little gain. One command per 2GB wiped doesn't seem like a bad thing. -- Alex Bligh
[Qemu-block] [PATCH v2 5/5] block: keep AioContext pointer in BlockBackend
From: Stefan Hajnocziblk_get/set_aio_context() delegate to BlockDriverState without storing the AioContext pointer in BlockBackend. There are two flaws: 1. BlockBackend falls back to the QEMU main loop AioContext when there is no root BlockDriverState. This means the drive loses its AioContext during media change and would break dataplane. 2. BlockBackend state used from multiple threads has no lock. Race conditions will creep in as functionality is moved from BlockDriverState to BlockBackend due to the absense of a lock. The monitor cannot access BlockBackend state safely while an IOThread is also accessing the state. Issue #1 can be triggered by "change" on virtio-scsi dataplane, causing a assertion failure (virtio-blk is fine because medium change is not possible). #2 may be possible with block accounting statistics in BlockBackend but I'm not aware of a crash that can be triggered. This patch stores the AioContext pointer in BlockBackend and puts newly inserted BlockDriverStates into the AioContext. Signed-off-by: Stefan Hajnoczi Signed-off-by: Fam Zheng --- block/block-backend.c | 24 +--- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/block/block-backend.c b/block/block-backend.c index b71babe..cda67cc 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -31,6 +31,7 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb); struct BlockBackend { char *name; int refcnt; +AioContext *aio_context; BdrvChild *root; DriveInfo *legacy_dinfo;/* null unless created by drive_new() */ QTAILQ_ENTRY(BlockBackend) link; /* for block_backends */ @@ -121,6 +122,7 @@ static BlockBackend *blk_new_with_ctx(AioContext *ctx) blk = g_new0(BlockBackend, 1); blk->refcnt = 1; +blk->aio_context = ctx; blk_set_enable_write_cache(blk, true); qemu_co_queue_init(>public.throttled_reqs[0]); @@ -510,6 +512,8 @@ void blk_remove_bs(BlockBackend *blk) void blk_insert_bs(BlockBackend *blk, BlockDriverState *bs) { bdrv_ref(bs); + +assert(blk->aio_context == bdrv_get_aio_context(bs)); blk->root = bdrv_root_attach_child(bs, "root", _root, blk); notifier_list_notify(>insert_bs_notifiers, blk); @@ -1413,13 +1417,7 @@ void blk_op_unblock_all(BlockBackend *blk, Error *reason) AioContext *blk_get_aio_context(BlockBackend *blk) { -BlockDriverState *bs = blk_bs(blk); - -if (bs) { -return bdrv_get_aio_context(bs); -} else { -return qemu_get_aio_context(); -} +return blk->aio_context; } static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb) @@ -1432,7 +1430,19 @@ void blk_set_aio_context(BlockBackend *blk, AioContext *new_context) { BlockDriverState *bs = blk_bs(blk); +blk->aio_context = new_context; + if (bs) { +AioContext *ctx = bdrv_get_aio_context(bs); + +if (ctx == new_context) { +return; +} +/* Moving context around happens when a block device is + * enabling/disabling data plane, in which case we own the root BDS and + * it cannot be associated with another AioContext. */ +assert(ctx == qemu_get_aio_context() || + new_context == qemu_get_aio_context()); if (blk->public.throttle_state) { throttle_timers_detach_aio_context(>public.throttle_timers); } -- 2.7.4
[Qemu-block] [PATCH v2 3/5] block: Introduce and make use of blk_new_with_root
The idiom of "blk_new() + blk_insert_bs()" is common. Use a new BB interface that does both things to make the coming transition that adds AioContext to BB easier. The added blk_new_with_ctx doesn't handle the passed ctx and is purely here to avoid a small code churn. Signed-off-by: Fam Zheng--- block/backup.c | 3 +-- block/block-backend.c| 24 ++-- block/commit.c | 12 block/mirror.c | 3 +-- blockjob.c | 3 +-- hmp.c| 3 +-- hw/core/qdev-properties-system.c | 3 +-- include/sysemu/block-backend.h | 1 + nbd/server.c | 3 +-- tests/test-blockjob.c| 3 +-- 10 files changed, 30 insertions(+), 28 deletions(-) diff --git a/block/backup.c b/block/backup.c index 582bd0f..3eed9a1 100644 --- a/block/backup.c +++ b/block/backup.c @@ -601,8 +601,7 @@ void backup_start(const char *job_id, BlockDriverState *bs, goto error; } -job->target = blk_new(); -blk_insert_bs(job->target, target); +job->target = blk_new_with_root(target); job->on_source_error = on_source_error; job->on_target_error = on_target_error; diff --git a/block/block-backend.c b/block/block-backend.c index 0bd19ab..b71babe 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -115,12 +115,7 @@ static const BdrvChildRole child_root = { .drained_end= blk_root_drained_end, }; -/* - * Create a new BlockBackend with a reference count of one. - * Store an error through @errp on failure, unless it's null. - * Return the new BlockBackend on success, null on failure. - */ -BlockBackend *blk_new(void) +static BlockBackend *blk_new_with_ctx(AioContext *ctx) { BlockBackend *blk; @@ -139,6 +134,23 @@ BlockBackend *blk_new(void) } /* + * Create a new BlockBackend with a reference count of one. + * Store an error through @errp on failure, unless it's null. + * Return the new BlockBackend on success, null on failure. + */ +BlockBackend *blk_new(void) +{ +return blk_new_with_ctx(qemu_get_aio_context()); +} + +BlockBackend *blk_new_with_root(BlockDriverState *root) +{ +BlockBackend *blk = blk_new_with_ctx(bdrv_get_aio_context(root)); +blk_insert_bs(blk, root); +return blk; +} + +/* * Creates a new BlockBackend, opens a new BlockDriverState, and connects both. * * Just as with bdrv_open(), after having called this function the reference to diff --git a/block/commit.c b/block/commit.c index 9f67a8b..45d4f96 100644 --- a/block/commit.c +++ b/block/commit.c @@ -260,11 +260,9 @@ void commit_start(const char *job_id, BlockDriverState *bs, } -s->base = blk_new(); -blk_insert_bs(s->base, base); +s->base = blk_new_with_root(base); -s->top = blk_new(); -blk_insert_bs(s->top, top); +s->top = blk_new_with_root(top); s->active = bs; @@ -314,11 +312,9 @@ int bdrv_commit(BlockDriverState *bs) } } -src = blk_new(); -blk_insert_bs(src, bs); +src = blk_new_with_root(bs); -backing = blk_new(); -blk_insert_bs(backing, bs->backing->bs); +backing = blk_new_with_root(bs->backing->bs); length = blk_getlength(src); if (length < 0) { diff --git a/block/mirror.c b/block/mirror.c index f9d1fec..8910d53 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -941,8 +941,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs, return; } -s->target = blk_new(); -blk_insert_bs(s->target, target); +s->target = blk_new_with_root(target); s->replaces = g_strdup(replaces); s->on_source_error = on_source_error; diff --git a/blockjob.c b/blockjob.c index a167f96..8fe6d5d 100644 --- a/blockjob.c +++ b/blockjob.c @@ -148,8 +148,7 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver, return NULL; } -blk = blk_new(); -blk_insert_bs(blk, bs); +blk = blk_new_with_root(bs); job = g_malloc0(driver->instance_size); error_setg(>blocker, "block device is in use by block job: %s", diff --git a/hmp.c b/hmp.c index 336e7bf..1ee83d2 100644 --- a/hmp.c +++ b/hmp.c @@ -1934,8 +1934,7 @@ void hmp_qemu_io(Monitor *mon, const QDict *qdict) if (!blk) { BlockDriverState *bs = bdrv_lookup_bs(NULL, device, ); if (bs) { -blk = local_blk = blk_new(); -blk_insert_bs(blk, bs); +blk = local_blk = blk_new_with_root(bs); } else { goto fail; } diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c index e55afe6..2b07beb 100644 --- a/hw/core/qdev-properties-system.c +++ b/hw/core/qdev-properties-system.c @@ -78,8 +78,7 @@ static void parse_drive(DeviceState *dev, const char *str, void **ptr, if (!blk) { BlockDriverState *bs = bdrv_lookup_bs(NULL, str, NULL); if (bs) { -
[Qemu-block] [PATCH v2 4/5] migration: Set correct AioContext to BlockBackend
The BB is newly created and it should follow the BDS's current context. Signed-off-by: Fam Zheng--- migration/block.c | 1 + 1 file changed, 1 insertion(+) diff --git a/migration/block.c b/migration/block.c index ebc10e6..b7365ee 100644 --- a/migration/block.c +++ b/migration/block.c @@ -445,6 +445,7 @@ static void init_blk_migration(QEMUFile *f) BlockDriverState *bs = bmds_bs[i].bs; if (bmds) { +blk_set_aio_context(bmds->blk, bdrv_get_aio_context(bs)); blk_insert_bs(bmds->blk, bs); alloc_aio_bitmap(bmds); -- 2.7.4
[Qemu-block] [PATCH v2 2/5] blockdev: Move BDS AioContext before inserting to BB
The callers made sure the BDS has no BB attached before, so blk is becoming the sole owner. It is necessary to move the BDS to the right AioContext before inserting it, to keep them in sync. Signed-off-by: Fam Zheng--- blockdev.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/blockdev.c b/blockdev.c index a4960b9..9700dee 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2478,6 +2478,7 @@ out: static void qmp_blockdev_insert_anon_medium(BlockBackend *blk, BlockDriverState *bs, Error **errp) { +AioContext *ctx; bool has_device; /* For BBs without a device, we can exchange the BDS tree at will */ @@ -2498,6 +2499,12 @@ static void qmp_blockdev_insert_anon_medium(BlockBackend *blk, return; } +ctx = blk_get_aio_context(blk); +if (ctx != bdrv_get_aio_context(bs)) { +aio_context_acquire(ctx); +bdrv_set_aio_context(bs, ctx); +aio_context_release(ctx); +} blk_insert_bs(blk, bs); if (!blk_dev_has_tray(blk)) { -- 2.7.4
[Qemu-block] [PATCH v2 0/5] block: keep AioContext pointer in BlockBackend
The first patches clean up usage of BlockBackend and changing of its (root's) aio contexts; the last patch is an update of Stefan's previous version rebasing on top of current master. The biggest change from the RFC is that blk_insert_bs callers are responsible to put the BB and BDS on the same context before calling it. This fixes the crash triggered by "change" a scsi-cd on a virtio-scsi dataplane device. The new assertions in block-backend.c ensures we won't have a conflict pair of BlockBackend users from different contextes. Fam Zheng (4): blockdev-mirror: Sanity check before moving target_bs AioContext blockdev: Move BDS AioContext before inserting to BB block: Introduce and make use of blk_new_with_root migration: Set correct AioContext to BlockBackend Stefan Hajnoczi (1): block: keep AioContext pointer in BlockBackend block/backup.c | 3 +-- block/block-backend.c| 48 +--- block/commit.c | 12 -- block/mirror.c | 3 +-- blockdev.c | 42 ++- blockjob.c | 3 +-- hmp.c| 3 +-- hw/core/qdev-properties-system.c | 3 +-- include/sysemu/block-backend.h | 1 + migration/block.c| 1 + nbd/server.c | 3 +-- tests/test-blockjob.c| 3 +-- 12 files changed, 79 insertions(+), 46 deletions(-) -- 2.7.4
[Qemu-block] [PATCH v2 1/5] blockdev-mirror: Sanity check before moving target_bs AioContext
Similar to blockdev-backup, if the target was already moved to a different AioContext, bad things can happen. This happens when the target belongs to a data plane device. It's a very unlikely case, but let's check it anyway. Signed-off-by: Fam Zheng--- blockdev.c | 35 --- 1 file changed, 24 insertions(+), 11 deletions(-) diff --git a/blockdev.c b/blockdev.c index 29c6561..a4960b9 100644 --- a/blockdev.c +++ b/blockdev.c @@ -3281,6 +3281,21 @@ BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp) return bdrv_named_nodes_list(errp); } +static void blockdev_set_aio_context(BlockDriverState *bs, AioContext *ctx, + Error **errp) +{ +if (bdrv_get_aio_context(bs) != ctx) { +if (!bdrv_has_blk(bs)) { +/* The target BDS is not attached, we can safely move it to another + * AioContext. */ +bdrv_set_aio_context(bs, ctx); +} else { +error_setg(errp, "Target is attached to a different thread from " + "source."); +} +} +} + void do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn, Error **errp) { BlockDriverState *bs; @@ -3317,16 +3332,10 @@ void do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn, Error **errp) goto out; } -if (bdrv_get_aio_context(target_bs) != aio_context) { -if (!bdrv_has_blk(target_bs)) { -/* The target BDS is not attached, we can safely move it to another - * AioContext. */ -bdrv_set_aio_context(target_bs, aio_context); -} else { -error_setg(errp, "Target is attached to a different thread from " - "source."); -goto out; -} +blockdev_set_aio_context(target_bs, aio_context, _err); +if (local_err) { +error_propagate(errp, local_err); +goto out; } backup_start(backup->job_id, bs, target_bs, backup->speed, backup->sync, NULL, backup->compress, backup->on_source_error, @@ -3538,7 +3547,11 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp) goto out; } -bdrv_set_aio_context(target_bs, aio_context); +blockdev_set_aio_context(target_bs, aio_context, _err); +if (local_err) { +error_propagate(errp, local_err); +goto out; +} blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs, arg->has_replaces, arg->replaces, arg->sync, -- 2.7.4