Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)
Peter Xu writes: > On Thu, Jul 11, 2024 at 11:44:08AM -0300, Fabiano Rosas wrote: >> But of course, that means we cannot claim to support all kinds of >> forward migrations anymore. Only those in the 6 year period. > > That "6 years" comes from machine type deprecation period, and migration > compatibility is mostly only attached to machine types, and we only ever > allowed migration with the same machine type. > > It means, >6 years migration will never work anyway as soon as we start to > deprecate machine types (irrelevant of any reuse of UNUSED), because the >6 > years machine types will simply be gone.. See configuration_post_load(), > where it'll throw an error upfront when machine type mismatched. Yes, duh! What am I talking about...
Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)
Peter Xu writes: > On Thu, Jul 11, 2024 at 10:34:12AM -0300, Fabiano Rosas wrote: >> Is there an easy way to look at a field and tell in which machine type's >> timeframe it was introduced? > > I am not aware of any. > >> If the machine type of that era has been removed, then the field is free >> to go as well. I'd prefer if we had a hard link instead of just counting >> years. Maybe we should to that mapping at the machine deprecation time? >> As in, "look at the unused fields introduced in that timeframe and mark >> them free". > > We can do that, but depending on how easy it would be. That can be an > overkill to me if it's non-trivial. When it becomes complicated, I'd > rather make machine compat property easier to use so we always stick with > that. Currently it's not as easy to use. > > Maybe we shouldn't make it a common rule to let people reuse the UNUSED > fields, even if in this case it's probably fine? > > E.g. I don't think it's a huge deal to keep all UNUSED fields forever - > sending 512B zeros for only one specific device isn't an issue even if kept > forever. > > If "over 6 years" would be okay and simple enough, then maybe we can stick > with that (and only if people would like to reuse a field and ask; that's > after all not required..). If more than that I doubt whether we should > spend time working on covering all the fields. I'm fine with a simple rule. But of course, that means we cannot claim to support all kinds of forward migrations anymore. Only those in the 6 year period.
Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)
Peter Xu writes: > On Wed, Jul 10, 2024 at 06:38:26PM -0300, Fabiano Rosas wrote: >> Peter Xu writes: >> >> > On Wed, Jul 10, 2024 at 04:48:23PM -0300, Fabiano Rosas wrote: >> >> Peter Xu writes: >> >> >> >> > On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote: >> >> >> It's not about trust, we simply don't support migrations other than >> >> >> n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included. >> >> > >> >> > Where does it come from? I thought we suppport that.. >> >> >> >> I'm taking that from: >> >> >> >> docs/devel/migration/main.rst: >> >> "In general QEMU tries to maintain forward migration compatibility >> >> (i.e. migrating from QEMU n->n+1) and there are users who benefit from >> >> backward compatibility as well." >> >> >> >> But of course it doesn't say whether that comes with a transitive rule >> >> allowing n->n+2 migrations. >> > >> > I'd say that "i.e." implies n->n+1 is not the only forward migration we >> > would support. >> > >> > I _think_ we should support all forward migration as long as the machine >> > type matches. >> > >> >> >> >> > >> >> > The same question would be: are we requesting an OpenStack cluster to >> >> > always upgrade QEMU with +1 versions, otherwise migration will fail? >> >> >> >> Will an OpenStack cluster be using upstream QEMU? If not, then that's a >> > >> > It's an example to show what I meant! :) Nothing else. Definitely not >> > saying that everyone should use an upstream released QEMU (but in reality, >> > it's not a problem, I think, and I do feel like people use them, perhaps >> > more with the stable releases). >> > >> >> question for the distro. In a very practical sense, we're not requesting >> >> anything. We barely test n->n+1/n->n-1, even if we had a strong support >> >> statement I wouldn't be confident saying migration from QEMU 2.7 -> QEMU >> >> 9.1 should succeed. >> > >> > No matter what we test in CI, I don't think we should break that for >1 >> > versions.. I hope 2.7->9.1 keeps working, otherwise I think it's legal to >> > file a bug by anyone. >> > >> > For example, I randomly fetched a bug report: >> > >> > https://gitlab.com/qemu-project/qemu/-/issues/1937 >> > >> > QEMU version:6.2 and 7.2.5 >> > >> > And I believe that's the common case even for upstream. If we don't do >> > that right for upstream, it can be impossible tasks for downstream and for >> > all of us to maintain. >> >> But do we do that right currently? I have no idea. Have we ever done >> it? And we're here discussing a hypothetical 2.7->9.1 ... >> >> So we cannot reuse the UNUSED field because QEMU from 2016 might send >> their data and QEMU from 2024 would interpret it wrong. >> >> How do we proceed? Add a subsection. And make the code survive when >> receiving 0. >> >> @Peter is that it? What about backwards-compat? We'll need a property as >> well it seems. > > Compat property is definitely one way to go, but I think it's you who more > or less persuaded me that reusing it seems possible! At least I can't yet > think of anything bad if it's ancient unused buffers. Since we're allowing any old QEMU version to migrate to the most recent one, we need to think of the data that was there before the introduction of the UNUSED field. If that QEMU version is used, then it's not going to be zeroes on the stream, but whatever data was there before. The new QEMU will be expecting the vendor_data introduced in this patch. > And that's why I was asking about a sane way to describe the "magic > year".. And I was very serious when I said "6 years" to follow the > deprecation of machine types, because it'll be something we can follow > to say when an unused buffer can be obsolete and it could make some > sense: if we will start to deprecate machine types, then it may not > make sense to keep any UNUSED compatible forever, after all. > Is there an easy way to look at a field and tell in which machine type's timeframe it was introduced? If the machine type of that era has been removed, then the field is free to go as well. I'd prefer if we had a hard link instead of just counting years. Maybe we should to that mapping at the machine deprecation time? As in, "look at the unused fields introduced in that timeframe and mark them free". > And we need that ruler to be as accurate as "always 6 years to follow > machine type deprecation procedure", in case someone else tomorrow asks us > something that was only UNUSED since 2018. We need a rule of thumb if we > want to reuse it, and if all of you agree we can start with this one, > perhaps with a comment above the field (before we think all through and > make it a rule on deprecating UNUSED)?
Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)
Peter Xu writes: > On Wed, Jul 10, 2024 at 04:48:23PM -0300, Fabiano Rosas wrote: >> Peter Xu writes: >> >> > On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote: >> >> It's not about trust, we simply don't support migrations other than >> >> n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included. >> > >> > Where does it come from? I thought we suppport that.. >> >> I'm taking that from: >> >> docs/devel/migration/main.rst: >> "In general QEMU tries to maintain forward migration compatibility >> (i.e. migrating from QEMU n->n+1) and there are users who benefit from >> backward compatibility as well." >> >> But of course it doesn't say whether that comes with a transitive rule >> allowing n->n+2 migrations. > > I'd say that "i.e." implies n->n+1 is not the only forward migration we > would support. > > I _think_ we should support all forward migration as long as the machine > type matches. > >> >> > >> > The same question would be: are we requesting an OpenStack cluster to >> > always upgrade QEMU with +1 versions, otherwise migration will fail? >> >> Will an OpenStack cluster be using upstream QEMU? If not, then that's a > > It's an example to show what I meant! :) Nothing else. Definitely not > saying that everyone should use an upstream released QEMU (but in reality, > it's not a problem, I think, and I do feel like people use them, perhaps > more with the stable releases). > >> question for the distro. In a very practical sense, we're not requesting >> anything. We barely test n->n+1/n->n-1, even if we had a strong support >> statement I wouldn't be confident saying migration from QEMU 2.7 -> QEMU >> 9.1 should succeed. > > No matter what we test in CI, I don't think we should break that for >1 > versions.. I hope 2.7->9.1 keeps working, otherwise I think it's legal to > file a bug by anyone. > > For example, I randomly fetched a bug report: > > https://gitlab.com/qemu-project/qemu/-/issues/1937 > > QEMU version:6.2 and 7.2.5 > > And I believe that's the common case even for upstream. If we don't do > that right for upstream, it can be impossible tasks for downstream and for > all of us to maintain. But do we do that right currently? I have no idea. Have we ever done it? And we're here discussing a hypothetical 2.7->9.1 ... So we cannot reuse the UNUSED field because QEMU from 2016 might send their data and QEMU from 2024 would interpret it wrong. How do we proceed? Add a subsection. And make the code survive when receiving 0. @Peter is that it? What about backwards-compat? We'll need a property as well it seems.
Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)
Peter Xu writes: > On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote: >> It's not about trust, we simply don't support migrations other than >> n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included. > > Where does it come from? I thought we suppport that.. I'm taking that from: docs/devel/migration/main.rst: "In general QEMU tries to maintain forward migration compatibility (i.e. migrating from QEMU n->n+1) and there are users who benefit from backward compatibility as well." But of course it doesn't say whether that comes with a transitive rule allowing n->n+2 migrations. > > The same question would be: are we requesting an OpenStack cluster to > always upgrade QEMU with +1 versions, otherwise migration will fail? Will an OpenStack cluster be using upstream QEMU? If not, then that's a question for the distro. In a very practical sense, we're not requesting anything. We barely test n->n+1/n->n-1, even if we had a strong support statement I wouldn't be confident saying migration from QEMU 2.7 -> QEMU 9.1 should succeed.
Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)
Peter Xu writes: > On Wed, Jul 10, 2024 at 11:08:20AM -0300, Fabiano Rosas wrote: >> >> I think it's ok: >> >> >> >> { >> >> "field": "unused", >> >> "version_id": 1, >> >> "field_exists": false, >> >> "size": 512 >> >> }, >> >> >> >> vs. >> >> >> >> { >> >> "field": "vendor_data", >> >> "version_id": 0, >> >> "field_exists": false, >> >> "num": 512, >> >> "size": 1 >> >> }, >> >> >> >> The unused field was introduced in 2016 so there's no chance of >> >> migrating a QEMU that old to/from 9.1. >> > >> > What happens if an old qemu 9.0 sends rubbish here to a new QEMU, while the >> > new QEMU would consider it meaningful data? >> >> It will send zeros, no? The code will have to cope with that. The >> alternative is to put the vendor_data in a subsection and the code will >> also have to cope with the lack of data when the old QEMU doesn't send >> it. > > Ah indeed, that "static const uint8_t buf[1024]" is there at least since > 2017. So yes, probably always sending zeros. @Philippe, can vendor_data be 0 after migration? Otherwise 9.0 -> 9.1 migration might crash. > > Nothing I can think of otherwise indeed, if we want to trust that nothing > will migrate before 2016. It's just that we may want to know how that > "2016" is justified to be safe if we would like to allow that in the > future. It's not about trust, we simply don't support migrations other than n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included. > > One thing _could_ be that "rule of thumb" is we plan to obsolete machines > with 6 years, so anything "UNUSED" older than 6 years can be over-written?
Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)
Peter Xu writes: > On Tue, Jul 09, 2024 at 05:38:54PM -0300, Fabiano Rosas wrote: >> Philippe Mathieu-Daudé writes: >> >> > "General command" (GEN_CMD, CMD56) is described as: >> > >> > GEN_CMD is the same as the single block read or write >> > commands (CMD24 or CMD17). The difference is that [...] >> > the data block is not a memory payload data but has a >> > vendor specific format and meaning. >> > >> > Thus this block must not be stored overwriting data block >> > on underlying storage drive. Keep it in a dedicated >> > 'vendor_data[]' array. >> > >> > Signed-off-by: Philippe Mathieu-Daudé >> > Tested-by: Cédric Le Goater >> > --- >> > RFC: Is it safe to reuse VMSTATE_UNUSED_V() (which happens >> > to be the same size)? >> >> Hi, sorry it took some time to get to this, I had just left for vacation >> when you first posted. > > And I totally overlooked there's the email.. until you replied. Welcome > back. Thanks! > >> >> I think it's ok: >> >> { >> "field": "unused", >> "version_id": 1, >> "field_exists": false, >> "size": 512 >> }, >> >> vs. >> >> { >> "field": "vendor_data", >> "version_id": 0, >> "field_exists": false, >> "num": 512, >> "size": 1 >> }, >> >> The unused field was introduced in 2016 so there's no chance of >> migrating a QEMU that old to/from 9.1. > > What happens if an old qemu 9.0 sends rubbish here to a new QEMU, while the > new QEMU would consider it meaningful data? It will send zeros, no? The code will have to cope with that. The alternative is to put the vendor_data in a subsection and the code will also have to cope with the lack of data when the old QEMU doesn't send it.
Re: [PATCH v3 06/17] hw/sd/sdcard: Do not store vendor data on block drive (CMD56)
Philippe Mathieu-Daudé writes: > "General command" (GEN_CMD, CMD56) is described as: > > GEN_CMD is the same as the single block read or write > commands (CMD24 or CMD17). The difference is that [...] > the data block is not a memory payload data but has a > vendor specific format and meaning. > > Thus this block must not be stored overwriting data block > on underlying storage drive. Keep it in a dedicated > 'vendor_data[]' array. > > Signed-off-by: Philippe Mathieu-Daudé > Tested-by: Cédric Le Goater > --- > RFC: Is it safe to reuse VMSTATE_UNUSED_V() (which happens > to be the same size)? Hi, sorry it took some time to get to this, I had just left for vacation when you first posted. I think it's ok: { "field": "unused", "version_id": 1, "field_exists": false, "size": 512 }, vs. { "field": "vendor_data", "version_id": 0, "field_exists": false, "num": 512, "size": 1 }, The unused field was introduced in 2016 so there's no chance of migrating a QEMU that old to/from 9.1.
Re: [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
Thomas Huth writes: > On 21/05/2024 14.46, Fabiano Rosas wrote: >> Alex Bennée writes: >> >>> Juan Quintela writes: >>> >>>> From: Fabiano Rosas >>>> >>>> Add a smoke test that migrates to a file and gives it to the >>>> script. It should catch the most annoying errors such as changes in >>>> the ram flags. >>>> >>>> After code has been merged it becomes way harder to figure out what is >>>> causing the script to fail, the person making the change is the most >>>> likely to know right away what the problem is. >>>> >>>> Signed-off-by: Fabiano Rosas >>>> Acked-by: Thomas Huth >>>> Reviewed-by: Juan Quintela >>>> Signed-off-by: Juan Quintela >>>> Message-ID: <20231009184326.15777-7-faro...@suse.de> >>> >>> I bisected the failures I'm seeing on s390x to the introduction of this >>> script. I don't know if its simply a timeout on a relatively slow VM: >> >> What's the range of your bisect? That test has been disabled and then >> reenabled on s390x. It could be tripping the bisect. >> >> 04131e0009 ("tests/qtest/migration-test: Disable the analyze-migration.py >> test on s390x") >> 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x") >> >> I don't think that test itself could be timing out. It's a very simple >> test. It runs a migration and then uses the output to validate the >> script. > > Agreed, the analyze-migration.py is unlikely to be the issue - especially > since it seems to have been disabled again in commit 6f0771de903b ... > Fabiano, why did you disable it here again? The reason is not mentioned in > the commit description. Your patch 81c2c9dd5d was merged between my v1 and v2 on the list and I didn't notice so I messed up the rebase. I'll send a patch soon to fix that.
Re: [PULL 10/38] tests/qtest/migration: Add a test for the analyze-migration script
Alex Bennée writes: > Juan Quintela writes: > >> From: Fabiano Rosas >> >> Add a smoke test that migrates to a file and gives it to the >> script. It should catch the most annoying errors such as changes in >> the ram flags. >> >> After code has been merged it becomes way harder to figure out what is >> causing the script to fail, the person making the change is the most >> likely to know right away what the problem is. >> >> Signed-off-by: Fabiano Rosas >> Acked-by: Thomas Huth >> Reviewed-by: Juan Quintela >> Signed-off-by: Juan Quintela >> Message-ID: <20231009184326.15777-7-faro...@suse.de> > > I bisected the failures I'm seeing on s390x to the introduction of this > script. I don't know if its simply a timeout on a relatively slow VM: What's the range of your bisect? That test has been disabled and then reenabled on s390x. It could be tripping the bisect. 04131e0009 ("tests/qtest/migration-test: Disable the analyze-migration.py test on s390x") 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x") I don't think that test itself could be timing out. It's a very simple test. It runs a migration and then uses the output to validate the script. I don't have a Z machine at hand and the migration tests only run with KVM for s390x, so it would be useful to take a look at meson's testlog.txt so we can see which test is failing and hopefully in what way it is failing. If you're up for it, running this in a loop is usually the best way to catch any intermittent issues: QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test And once you figure out which test, there's this monstrosity: QTEST_QEMU_BINARY='gdb -q --ex "set pagination off" \ --ex "set print thread-events off" \ --ex "handle SIGUSR1 noprint" \ --ex "handle SIGPIPE noprint" \ --ex "run" --ex "quit \$_exitcode" \ --args ./qemu-system-x86_64' \ gdb -q --ex "set prompt (qtest) " \ --ex "handle SIGPIPE noprint" \ --args ./tests/qtest/migration-test -p /x86_64/migration// > Summary of Failures: > > 36/546 qemu:qtest+qtest-s390x / qtest-s390x/migration-test > ERROR 93.51s killed by signal 6 SIGABRT > > It seems to be unstable as we pass sometimes: > > 11:26:42 [ajb@qemu01:~/l/q/b/system] master|… + ./pyvenv/bin/meson test > --repeat 100 qtest-s390x/migration-test > ninja: Entering directory `/home/ajb/lsrc/qemu.git/builds/system' > [1/9] Generating qemu-version.h with a custom command (wrapped by meson to > capture output) > 1/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test ERROR > 251.98s killed by signal 6 SIGABRT >>>> MALLOC_PERTURB_=9 >>>> PYTHON=/home/ajb/lsrc/qemu.git/builds/system/pyvenv/bin/python3 >>>> G_TEST_DBUS_DAEMON=/home/ajb/lsrc/qemu.git/tests/dbus-vmstate-daemon.sh >>>> QTEST_QEMU_BINARY=./qemu-system-s390x QTEST_QEMU_IMG=./qemu-img >>>> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon >>>> /home/ajb/lsrc/qemu.git/builds/system/tests/qtest/migration-test --tap -k > > 2/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test ERROR > 258.71s killed by signal 6 SIGABRT >>>> PYTHON=/home/ajb/lsrc/qemu.git/builds/system/pyvenv/bin/python3 >>>> MALLOC_PERTURB_=205 >>>> G_TEST_DBUS_DAEMON=/home/ajb/lsrc/qemu.git/tests/dbus-vmstate-daemon.sh >>>> QTEST_QEMU_BINARY=./qemu-system-s390x QTEST_QEMU_IMG=./qemu-img >>>> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon >>>> /home/ajb/lsrc/qemu.git/builds/system/tests/qtest/migration-test --tap -k > > 3/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test OK > 302.53s 46 subtests passed > 4/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test OK > 319.56s 46 subtests passed > 5/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test OK > 320.11s 46 subtests passed > 6/100 qemu:qtest+qtest-s390x / qtest-s390x/migration-test OK > 328.40s 46 subtests passed > > Ok: 4 > Expected Fail: 0 > Fail: 2 > Unexpected Pass:0 > Skipped:0 > Timeout:0 > >> --- >> tests/qtest/migration-test.c | 60 >>
Re: [PATCH 5/6] migration: Rephrase message on failure to save / load Xen device state
Markus Armbruster writes: > Functions that use an Error **errp parameter to return errors should > not also report them to the user, because reporting is the caller's > job. When the caller does, the error is reported twice. When it > doesn't (because it recovered from the error), there is no error to > report, i.e. the report is bogus. > > qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate > this principle: they call qemu_save_device_state() and > qemu_loadvm_state(), which call error_report_err(). > > I wish I could clean this up now, but migration's error reporting is > too complicated (confused?) for me to mess with it. > > Instead, I'm merely improving the error reported by > qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the > QMP core from > > An IO error has occurred > > to > saving Xen device state failed > > and > > loading Xen device state failed > > respectively. > > Signed-off-by: Markus Armbruster Acked-by: Fabiano Rosas
[PATCH v3 10/11] block: Convert qmp_query_block and qmp_query_named_block_nodes to coroutine
From: Lin Ma Convert the remaining functions to make the QMP commands query-block and query-named-block-nodes run in their entirety in a coroutine. With this, any yield from those commands will return all the way back to the main loop. This releases the BQL and the main loop and avoids having the QMP command block another more important task from running. Both commands need to be converted at once because hmp_info_block calls both and it needs to be moved to a coroutine as well. Now the wrapper for bdrv_co_get_allocated_file_size() can be made not mixed and the wrapper for bdrv_co_block_device_info() can be removed. Signed-off-by: Lin Ma Signed-off-by: Fabiano Rosas --- block.c| 8 block/monitor/block-hmp-cmds.c | 2 +- block/qapi.c | 12 ++-- blockdev.c | 8 hmp-commands-info.hx | 1 + include/block/block-global-state.h | 3 ++- include/block/block-hmp-cmds.h | 2 +- include/block/block-io.h | 2 +- include/block/qapi.h | 3 --- qapi/block-core.json | 5 +++-- 10 files changed, 23 insertions(+), 23 deletions(-) diff --git a/block.c b/block.c index 01478c5471..cba28f07fc 100644 --- a/block.c +++ b/block.c @@ -6207,18 +6207,18 @@ BlockDriverState *bdrv_find_node(const char *node_name) } /* Put this QMP function here so it can access the static graph_bdrv_states. */ -BlockDeviceInfoList *bdrv_named_nodes_list(bool flat, - Error **errp) +BlockDeviceInfoList *coroutine_fn bdrv_co_named_nodes_list(bool flat, + Error **errp) { BlockDeviceInfoList *list; BlockDriverState *bs; GLOBAL_STATE_CODE(); -GRAPH_RDLOCK_GUARD_MAINLOOP(); +GRAPH_RDLOCK_GUARD(); list = NULL; QTAILQ_FOREACH(bs, _bdrv_states, node_list) { -BlockDeviceInfo *info = bdrv_block_device_info(NULL, bs, flat, errp); +BlockDeviceInfo *info = bdrv_co_block_device_info(NULL, bs, flat, errp); if (!info) { qapi_free_BlockDeviceInfoList(list); return NULL; diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c index 9db587c661..8ceff59688 100644 --- a/block/monitor/block-hmp-cmds.c +++ b/block/monitor/block-hmp-cmds.c @@ -738,7 +738,7 @@ static void print_block_info(Monitor *mon, BlockInfo *info, } } -void hmp_info_block(Monitor *mon, const QDict *qdict) +void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict) { BlockInfoList *block_list, *info; BlockDeviceInfoList *blockdev_list, *blockdev; diff --git a/block/qapi.c b/block/qapi.c index 9a59e5606f..c4514295ec 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -418,8 +418,8 @@ fail: } /* @p_info will be set only on success. */ -static void GRAPH_RDLOCK -bdrv_query_info(BlockBackend *blk, BlockInfo **p_info, Error **errp) +static void GRAPH_RDLOCK coroutine_fn +bdrv_co_query_info(BlockBackend *blk, BlockInfo **p_info, Error **errp) { BlockInfo *info = g_malloc0(sizeof(*info)); BlockDriverState *bs = blk_bs(blk); @@ -451,7 +451,7 @@ bdrv_query_info(BlockBackend *blk, BlockInfo **p_info, Error **errp) } if (bs && bs->drv) { -info->inserted = bdrv_block_device_info(blk, bs, false, errp); +info->inserted = bdrv_co_block_device_info(blk, bs, false, errp); if (info->inserted == NULL) { goto err; } @@ -661,13 +661,13 @@ bdrv_query_bds_stats(BlockDriverState *bs, bool blk_level) return s; } -BlockInfoList *qmp_query_block(Error **errp) +BlockInfoList *coroutine_fn qmp_query_block(Error **errp) { BlockInfoList *head = NULL, **p_next = BlockBackend *blk; Error *local_err = NULL; -GRAPH_RDLOCK_GUARD_MAINLOOP(); +GRAPH_RDLOCK_GUARD(); for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) { BlockInfoList *info; @@ -677,7 +677,7 @@ BlockInfoList *qmp_query_block(Error **errp) } info = g_malloc0(sizeof(*info)); -bdrv_query_info(blk, >value, _err); +bdrv_co_query_info(blk, >value, _err); if (local_err) { error_propagate(errp, local_err); g_free(info); diff --git a/blockdev.c b/blockdev.c index 057601dcf0..fe3226c8c4 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2744,13 +2744,13 @@ void qmp_drive_backup(DriveBackup *backup, Error **errp) blockdev_do_action(, errp); } -BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat, - bool flat, - Error **errp) +BlockDeviceInfoList *coroutine_fn qmp_query_named_block_nodes(bool has_flat, + bool flat, + Error **errp
[PATCH v3 07/11] block: Convert bdrv_query_image_info to coroutine
This function is a caller of bdrv_do_query_node_info(), which have been converted to a coroutine. Convert this function as well so we're closer from having the whole qmp_query_block as a single coroutine. Also remove the wrapper for bdrv_co_do_query_node_info() now that all its callers are converted. Signed-off-by: Fabiano Rosas --- block/qapi.c | 16 +++- include/block/qapi.h | 8 2 files changed, 11 insertions(+), 13 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index 7b1cf48246..5e263960a9 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -304,7 +304,7 @@ bdrv_co_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, } /** - * bdrv_query_image_info: + * bdrv_co_query_image_info: * @bs: block node to examine * @p_info: location to store image information * @flat: skip backing node information @@ -325,17 +325,15 @@ bdrv_co_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, * * @p_info will be set only on success. On error, store error in @errp. */ -void bdrv_query_image_info(BlockDriverState *bs, - ImageInfo **p_info, - bool flat, - bool skip_implicit_filters, - Error **errp) +void coroutine_fn +bdrv_co_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat, + bool skip_implicit_filters, Error **errp) { ERRP_GUARD(); ImageInfo *info; info = g_new0(ImageInfo, 1); -bdrv_do_query_node_info(bs, qapi_ImageInfo_base(info), errp); +bdrv_co_do_query_node_info(bs, qapi_ImageInfo_base(info), errp); if (*errp) { goto fail; } @@ -353,8 +351,8 @@ void bdrv_query_image_info(BlockDriverState *bs, } if (backing) { -bdrv_query_image_info(backing, >backing_image, false, - skip_implicit_filters, errp); +bdrv_co_query_image_info(backing, >backing_image, false, + skip_implicit_filters, errp); if (*errp) { goto fail; } diff --git a/include/block/qapi.h b/include/block/qapi.h index 76be9cc3e5..5f7e3a690e 100644 --- a/include/block/qapi.h +++ b/include/block/qapi.h @@ -38,7 +38,10 @@ int GRAPH_RDLOCK bdrv_query_snapshot_info_list(BlockDriverState *bs, SnapshotInfoList **p_list, Error **errp); -void GRAPH_RDLOCK +void coroutine_fn GRAPH_RDLOCK +bdrv_co_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat, + bool skip_implicit_filters, Error **errp); +void co_wrapper_bdrv_rdlock bdrv_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat, bool skip_implicit_filters, Error **errp); @@ -58,7 +61,4 @@ void bdrv_node_info_dump(BlockNodeInfo *info, int indentation, bool protocol); void coroutine_fn GRAPH_RDLOCK bdrv_co_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, Error **errp); -void co_wrapper_bdrv_rdlock -bdrv_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, -Error **errp); #endif -- 2.35.3
[PATCH v3 05/11] block: Run bdrv_do_query_node_info in a coroutine
Move this function into a coroutine so we can convert the whole qmp_query_block command into a coroutine in the next patches. Placing the entire command in a coroutine allow us to yield all the way back to the main loop, releasing the BQL and unblocking the main loop. When the whole conversion is completed, we'll be able to avoid a priority inversion that happens when a QMP command calls a slow (buggy) system call and blocks the vcpu thread from doing mmio due to contention on the BQL. About coroutine safety: Most callees have coroutine versions themselves and thus are safe to call in a coroutine. The remaining ones: - bdrv_refresh_filename, bdrv_get_full_backing_filename: String manipulation, nothing that would be unsafe for use in coroutines; - bdrv_get_format_name: Just accesses a field; - bdrv_get_specific_info, bdrv_query_snapshot_info_list: No locks or anything that would poll or block. (using a mixed wrapper for now, but after all callers are converted, this can become a coroutine exclusively) Signed-off-by: Fabiano Rosas --- - used the coroutine version of the called functions when available --- block/qapi.c | 11 ++- include/block/qapi.h | 8 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index 2b5793f1d9..26a9510345 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -225,8 +225,9 @@ int bdrv_query_snapshot_info_list(BlockDriverState *bs, * Helper function for other query info functions. Store information about @bs * in @info, setting @errp on error. */ -static void GRAPH_RDLOCK -bdrv_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, Error **errp) +void coroutine_fn +bdrv_co_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, + Error **errp) { int64_t size; const char *backing_filename; @@ -234,7 +235,7 @@ bdrv_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, Error **errp) int ret; Error *err = NULL; -size = bdrv_getlength(bs); +size = bdrv_co_getlength(bs); if (size < 0) { error_setg_errno(errp, -size, "Can't get image size '%s'", bs->exact_filename); @@ -246,13 +247,13 @@ bdrv_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, Error **errp) info->filename= g_strdup(bs->filename); info->format = g_strdup(bdrv_get_format_name(bs)); info->virtual_size= size; -info->actual_size = bdrv_get_allocated_file_size(bs); +info->actual_size = bdrv_co_get_allocated_file_size(bs); info->has_actual_size = info->actual_size >= 0; if (bs->encrypted) { info->encrypted = true; info->has_encrypted = true; } -if (bdrv_get_info(bs, ) >= 0) { +if (bdrv_co_get_info(bs, ) >= 0) { if (bdi.cluster_size != 0) { info->cluster_size = bdi.cluster_size; info->has_cluster_size = true; diff --git a/include/block/qapi.h b/include/block/qapi.h index 54c48de26a..234730b3de 100644 --- a/include/block/qapi.h +++ b/include/block/qapi.h @@ -25,6 +25,7 @@ #ifndef BLOCK_QAPI_H #define BLOCK_QAPI_H +#include "block/block-common.h" #include "block/graph-lock.h" #include "block/snapshot.h" #include "qapi/qapi-types-block-core.h" @@ -49,4 +50,11 @@ void bdrv_image_info_specific_dump(ImageInfoSpecific *info_spec, const char *prefix, int indentation); void bdrv_node_info_dump(BlockNodeInfo *info, int indentation, bool protocol); + +void coroutine_fn GRAPH_RDLOCK +bdrv_co_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, + Error **errp); +void co_wrapper_bdrv_rdlock +bdrv_do_query_node_info(BlockDriverState *bs, BlockNodeInfo *info, +Error **errp); #endif -- 2.35.3
[PATCH v3 04/11] block: Reschedule query-block during qcow2 invalidation
There is a small window at the end of block device migration when devices are being re-activated. This includes a resetting of some fields of BDRVQcow2State at qcow2_co_invalidate_cache(). A concurrent QMP query-block command can call qcow2_get_specific_info() during this window and see the cleared values, which leads to an assert: qcow2_get_specific_info: Assertion `false' failed This is the same issue as Gitlab #1933, which has already been resolved[1], but there the fix applied only to non-coroutine commands. Once we move query-block to a coroutine the problem will manifest again. Add an operation blocker to the invalidation function to block the query info path during this window. Instead of failing query-block, which would be disruptive to users, use the blocker to know when to reschedule the coroutine back into the iohandler so it doesn't run while the BDRVQcow2State is inconsistent. To avoid failing query-block when all block operations are blocked, unblock the INFO operation at various places. This preserves the prior situations where query-block used to work. 1 - https://gitlab.com/qemu-project/qemu/-/issues/1933 Signed-off-by: Fabiano Rosas --- block.c | 1 + block/mirror.c | 1 + block/qcow2.c| 20 block/replication.c | 1 + blockjob.c | 1 + include/block/block-common.h | 1 + migration/block.c| 1 + 7 files changed, 26 insertions(+) diff --git a/block.c b/block.c index 468cf5e67d..01478c5471 100644 --- a/block.c +++ b/block.c @@ -1296,6 +1296,7 @@ static void GRAPH_WRLOCK bdrv_backing_attach(BdrvChild *c) parent->backing_blocker); bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_TARGET, parent->backing_blocker); +bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_INFO, parent->backing_blocker); } static void bdrv_backing_detach(BdrvChild *c) diff --git a/block/mirror.c b/block/mirror.c index 1bdce3b657..9f952f3fdd 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1191,6 +1191,7 @@ static void mirror_complete(Job *job, Error **errp) error_setg(>replace_blocker, "block device is in use by block-job-complete"); bdrv_op_block_all(s->to_replace, s->replace_blocker); +bdrv_op_unblock(s->to_replace, BLOCK_OP_TYPE_INFO, s->replace_blocker); bdrv_ref(s->to_replace); } diff --git a/block/qcow2.c b/block/qcow2.c index 956128b409..b0ec8009e3 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2834,6 +2834,7 @@ qcow2_co_invalidate_cache(BlockDriverState *bs, Error **errp) BdrvChild *data_file; int flags = s->flags; QCryptoBlock *crypto = NULL; +Error *blocker = NULL; QDict *options; int ret; @@ -2845,6 +2846,17 @@ qcow2_co_invalidate_cache(BlockDriverState *bs, Error **errp) crypto = s->crypto; s->crypto = NULL; +/* + * When qcow2_do_open() below reads the qcow header, it yields to + * wait for the I/O which allows a concurrent QMP query-block + * command to be dispatched on the same context before + * BDRVQcow2State has been completely repopulated. Block the + * query-info operation during this window to avoid having + * qcow2_get_specific_info() access bogus values. + */ +error_setg(, "invalidating cached metadata"); +bdrv_op_block(bs, BLOCK_OP_TYPE_INFO, blocker); + /* * Do not reopen s->data_file (i.e., have qcow2_do_close() not close it, * and then prevent qcow2_do_open() from opening it), because this function @@ -2864,6 +2876,8 @@ qcow2_co_invalidate_cache(BlockDriverState *bs, Error **errp) qemu_co_mutex_lock(>lock); ret = qcow2_do_open(bs, options, flags, false, errp); qemu_co_mutex_unlock(>lock); +bdrv_op_unblock(bs, BLOCK_OP_TYPE_INFO, blocker); +g_free(blocker); qobject_unref(options); if (ret < 0) { error_prepend(errp, "Could not reopen qcow2 layer: "); @@ -5240,6 +5254,12 @@ qcow2_get_specific_info(BlockDriverState *bs, Error **errp) ImageInfoSpecific *spec_info; QCryptoBlockInfo *encrypt_info = NULL; +if (qemu_in_coroutine() && +bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_INFO, errp)) { +errp = NULL; +aio_co_reschedule_self(iohandler_get_aio_context()); +} + if (s->crypto != NULL) { encrypt_info = qcrypto_block_get_info(s->crypto, errp); if (!encrypt_info) { diff --git a/block/replication.c b/block/replication.c index ca6bd0a720..459d644673 100644 --- a/block/replication.c +++ b/block/replication.c @@ -577,6 +577,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode, } bdrv_op_block_all(top_bs, s->blocker); bdrv_op_unblock(top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker); +bdrv_op_unblock(top_
[PATCH v3 06/11] block: Convert bdrv_query_block_graph_info to coroutine
We're converting callers of bdrv_co_get_allocated_file_size() to run in coroutines because that function will be made asynchronous when called (indirectly) from the QMP dispatcher. This function is a candidate because it calls bdrv_do_query_node_info(), which in turn calls bdrv_co_get_allocated_file_size(). All the functions called from bdrv_do_query_node_info() onwards are coroutine-safe, either have a coroutine version themselves[1] or are mostly simple code/string manipulation[2]. 1) bdrv_co_getlength(), bdrv_co_get_allocated_file_size(), bdrv_co_get_info(); 2) bdrv_refresh_filename(), bdrv_get_format_name(), bdrv_get_full_backing_filename(), bdrv_query_snapshot_info_list(), bdrv_get_specific_info(); Signed-off-by: Fabiano Rosas Reviewed-by: Hanna Czenczek --- block/qapi.c | 14 -- include/block/qapi.h | 6 +- qemu-img.c | 3 --- 3 files changed, 13 insertions(+), 10 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index 26a9510345..7b1cf48246 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -370,7 +370,7 @@ fail: } /** - * bdrv_query_block_graph_info: + * bdrv_co_query_block_graph_info: * @bs: root node to start from * @p_info: location to store image information * @errp: location to store error information @@ -379,17 +379,19 @@ fail: * * @p_info will be set only on success. On error, store error in @errp. */ -void bdrv_query_block_graph_info(BlockDriverState *bs, - BlockGraphInfo **p_info, - Error **errp) +void coroutine_fn +bdrv_co_query_block_graph_info(BlockDriverState *bs, BlockGraphInfo **p_info, + Error **errp) { ERRP_GUARD(); BlockGraphInfo *info; BlockChildInfoList **children_list_tail; BdrvChild *c; +assert_bdrv_graph_readable(); + info = g_new0(BlockGraphInfo, 1); -bdrv_do_query_node_info(bs, qapi_BlockGraphInfo_base(info), errp); +bdrv_co_do_query_node_info(bs, qapi_BlockGraphInfo_base(info), errp); if (*errp) { goto fail; } @@ -403,7 +405,7 @@ void bdrv_query_block_graph_info(BlockDriverState *bs, QAPI_LIST_APPEND(children_list_tail, c_info); c_info->name = g_strdup(c->name); -bdrv_query_block_graph_info(c->bs, _info->info, errp); +bdrv_co_query_block_graph_info(c->bs, _info->info, errp); if (*errp) { goto fail; } diff --git a/include/block/qapi.h b/include/block/qapi.h index 234730b3de..76be9cc3e5 100644 --- a/include/block/qapi.h +++ b/include/block/qapi.h @@ -41,7 +41,11 @@ bdrv_query_snapshot_info_list(BlockDriverState *bs, void GRAPH_RDLOCK bdrv_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat, bool skip_implicit_filters, Error **errp); -void GRAPH_RDLOCK + +void coroutine_fn GRAPH_RDLOCK +bdrv_co_query_block_graph_info(BlockDriverState *bs, BlockGraphInfo **p_info, + Error **errp); +void co_wrapper_bdrv_rdlock bdrv_query_block_graph_info(BlockDriverState *bs, BlockGraphInfo **p_info, Error **errp); diff --git a/qemu-img.c b/qemu-img.c index 7668f86769..19db8f18fc 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -2958,10 +2958,7 @@ static BlockGraphInfoList *collect_image_info_list(bool image_opts, * duplicate the backing chain information that we obtain by walking * the chain manually here. */ -bdrv_graph_rdlock_main_loop(); bdrv_query_block_graph_info(bs, , ); -bdrv_graph_rdunlock_main_loop(); - if (err) { error_report_err(err); blk_unref(blk); -- 2.35.3
[PATCH v3 11/11] block: Add a thread-pool version of fstat
From: João Silva The fstat call can take a long time to finish when running over NFS. Add a version of it that runs in the thread pool. Adapt one of its users, raw_co_get_allocated_file size to use the new version. That function is called via QMP under the qemu_global_mutex so it has a large chance of blocking VCPU threads in case it takes too long to finish. Reviewed-by: Claudio Fontana Reviewed-by: Hanna Czenczek Signed-off-by: João Silva Signed-off-by: Fabiano Rosas --- block/file-posix.c | 40 +--- include/block/raw-aio.h | 4 +++- 2 files changed, 40 insertions(+), 4 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 35684f7e21..6fbf961244 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -226,6 +226,9 @@ typedef struct RawPosixAIOData { struct { unsigned long op; } zone_mgmt; +struct { +struct stat *st; +} fstat; }; } RawPosixAIOData; @@ -2616,6 +2619,34 @@ static void raw_close(BlockDriverState *bs) } } +static int handle_aiocb_fstat(void *opaque) +{ +RawPosixAIOData *aiocb = opaque; + +if (fstat(aiocb->aio_fildes, aiocb->fstat.st) < 0) { +return -errno; +} + +return 0; +} + +static int coroutine_fn raw_co_fstat(BlockDriverState *bs, struct stat *st) +{ +BDRVRawState *s = bs->opaque; +RawPosixAIOData acb; + +acb = (RawPosixAIOData) { +.bs = bs, +.aio_fildes = s->fd, +.aio_type = QEMU_AIO_FSTAT, +.fstat = { +.st = st, +}, +}; + +return raw_thread_pool_submit(handle_aiocb_fstat, ); +} + /** * Truncates the given regular file @fd to @offset and, when growing, fills the * new space according to @prealloc. @@ -2860,11 +2891,14 @@ static int64_t coroutine_fn raw_co_getlength(BlockDriverState *bs) static int64_t coroutine_fn raw_co_get_allocated_file_size(BlockDriverState *bs) { struct stat st; -BDRVRawState *s = bs->opaque; +int ret; -if (fstat(s->fd, ) < 0) { -return -errno; +ret = raw_co_fstat(bs, ); + +if (ret) { +return ret; } + return (int64_t)st.st_blocks * 512; } diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h index 20e000b8ef..0c6af8dc32 100644 --- a/include/block/raw-aio.h +++ b/include/block/raw-aio.h @@ -31,6 +31,7 @@ #define QEMU_AIO_ZONE_REPORT 0x0100 #define QEMU_AIO_ZONE_MGMT0x0200 #define QEMU_AIO_ZONE_APPEND 0x0400 +#define QEMU_AIO_FSTAT0x0800 #define QEMU_AIO_TYPE_MASK \ (QEMU_AIO_READ | \ QEMU_AIO_WRITE | \ @@ -42,7 +43,8 @@ QEMU_AIO_TRUNCATE | \ QEMU_AIO_ZONE_REPORT | \ QEMU_AIO_ZONE_MGMT | \ - QEMU_AIO_ZONE_APPEND) + QEMU_AIO_ZONE_APPEND | \ + QEMU_AIO_FSTAT) /* AIO flags */ #define QEMU_AIO_MISALIGNED 0x1000 -- 2.35.3
[PATCH v3 09/11] block: Don't query all block devices at hmp_nbd_server_start
We're currently doing a full query-block just to enumerate the devices for qmp_nbd_server_add and then discarding the BlockInfoList afterwards. Alter hmp_nbd_server_start to instead iterate explicitly over the block_backends list. This allows the removal of the dependency on qmp_query_block from hmp_nbd_server_start. This is desirable because we're about to move qmp_query_block into a coroutine and don't need to change the NBD code at the same time. Add the GRAPH_RDLOCK_GUARD_MAINLOOP macro because bdrv_skip_implicit_filters() needs the graph lock. Signed-off-by: Fabiano Rosas --- - add a comment explaining some checks are done to preserve previous behavior; - we need the strdup when assigning .device to preserve const. Just add a matching g_free(); - about the possible leak at qmp_nbd_server_add() unrelated to this patch, commit 8675cbd68b ("nbd: Utilize QAPI_CLONE for type conversion") mentions that the QAPI visitor will already free arg->name. --- block/monitor/block-hmp-cmds.c | 32 1 file changed, 24 insertions(+), 8 deletions(-) diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c index d954bec6f1..9db587c661 100644 --- a/block/monitor/block-hmp-cmds.c +++ b/block/monitor/block-hmp-cmds.c @@ -387,10 +387,12 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict) bool writable = qdict_get_try_bool(qdict, "writable", false); bool all = qdict_get_try_bool(qdict, "all", false); Error *local_err = NULL; -BlockInfoList *block_list, *info; +BlockBackend *blk; SocketAddress *addr; NbdServerAddOptions export; +GRAPH_RDLOCK_GUARD_MAINLOOP(); + if (writable && !all) { error_setg(_err, "-w only valid together with -a"); goto exit; @@ -415,29 +417,43 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict) /* Then try adding all block devices. If one fails, close all and * exit. */ -block_list = qmp_query_block(NULL); +for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) { +BlockDriverState *bs = blk_bs(blk); -for (info = block_list; info; info = info->next) { -if (!info->value->inserted) { +if (!*blk_name(blk)) { +continue; +} + +/* + * Note: historically we used to call qmp_query_block() to get + * the list of block devices. The two 'continue' cases below + * are the same as used by that function and are here to + * preserve behavior. + */ + +if (!blk_get_attached_dev(blk)) { +continue; +} + +bs = bdrv_skip_implicit_filters(bs); +if (!bs || !bs->drv) { continue; } export = (NbdServerAddOptions) { -.device = info->value->device, +.device = g_strdup(blk_name(blk)), .has_writable = true, .writable = writable, }; qmp_nbd_server_add(, _err); - +g_free(export.device); if (local_err != NULL) { qmp_nbd_server_stop(NULL); break; } } -qapi_free_BlockInfoList(block_list); - exit: hmp_handle_error(mon, local_err); } -- 2.35.3
[PATCH v3 08/11] block: Convert bdrv_block_device_info into co_wrapper
We're converting callers of bdrv_co_get_allocated_file_size() to run in coroutines because that function will be made asynchronous when called (indirectly) from the QMP dispatcher. This function is a candidate because it calls bdrv_query_image_info() -> bdrv_co_do_query_node_info() -> bdrv_co_get_allocated_file_size(). It is safe to turn this is a coroutine because the code it calls is made up of either simple accessors and string manipulation functions [1] or it has already been determined to be safe [2]. 1) bdrv_refresh_filename(), bdrv_is_read_only(), blk_enable_write_cache(), bdrv_cow_bs(), blk_get_public(), throttle_group_get_name(), bdrv_write_threshold_get(), bdrv_query_dirty_bitmaps(), throttle_group_get_config(), bdrv_filter_or_cow_bs(), bdrv_skip_implicit_filters() 2) bdrv_co_do_query_node_info() (see previous commits); This was the only caller of bdrv_query_image_info(), so we can remove the wrapper for that function now. Signed-off-by: Fabiano Rosas --- - used co_wrapper_bdrv_rdlock instead of co_wrapper --- block/qapi.c | 10 +- include/block/qapi.h | 13 ++--- 2 files changed, 11 insertions(+), 12 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index 5e263960a9..9a59e5606f 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -41,10 +41,10 @@ #include "qemu/qemu-print.h" #include "sysemu/block-backend.h" -BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, -BlockDriverState *bs, -bool flat, -Error **errp) +BlockDeviceInfo *coroutine_fn bdrv_co_block_device_info(BlockBackend *blk, +BlockDriverState *bs, +bool flat, +Error **errp) { ERRP_GUARD(); ImageInfo **p_image_info; @@ -152,7 +152,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, * Skip automatically inserted nodes that the user isn't aware of for * query-block (blk != NULL), but not for query-named-block-nodes */ -bdrv_query_image_info(bs, p_image_info, flat, blk != NULL, errp); +bdrv_co_query_image_info(bs, p_image_info, flat, blk != NULL, errp); if (*errp) { qapi_free_BlockDeviceInfo(info); return NULL; diff --git a/include/block/qapi.h b/include/block/qapi.h index 5f7e3a690e..9f0e957963 100644 --- a/include/block/qapi.h +++ b/include/block/qapi.h @@ -30,10 +30,12 @@ #include "block/snapshot.h" #include "qapi/qapi-types-block-core.h" -BlockDeviceInfo * GRAPH_RDLOCK -bdrv_block_device_info(BlockBackend *blk, BlockDriverState *bs, - bool flat, Error **errp); - +BlockDeviceInfo *coroutine_fn GRAPH_RDLOCK +bdrv_co_block_device_info(BlockBackend *blk, BlockDriverState *bs, bool flat, + Error **errp); +BlockDeviceInfo *co_wrapper_bdrv_rdlock +bdrv_block_device_info(BlockBackend *blk, BlockDriverState *bs, bool flat, + Error **errp); int GRAPH_RDLOCK bdrv_query_snapshot_info_list(BlockDriverState *bs, SnapshotInfoList **p_list, @@ -41,9 +43,6 @@ bdrv_query_snapshot_info_list(BlockDriverState *bs, void coroutine_fn GRAPH_RDLOCK bdrv_co_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat, bool skip_implicit_filters, Error **errp); -void co_wrapper_bdrv_rdlock -bdrv_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat, - bool skip_implicit_filters, Error **errp); void coroutine_fn GRAPH_RDLOCK bdrv_co_query_block_graph_info(BlockDriverState *bs, BlockGraphInfo **p_info, -- 2.35.3
[PATCH v3 03/11] block: Take the graph lock in bdrv_snapshot_list
This function has up until now always ran in the main loop, outside of a coroutine. We're about to make it run inside a coroutine so start actually taking the graph lock. Signed-off-by: Fabiano Rosas --- block/snapshot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/snapshot.c b/block/snapshot.c index 8242b4abac..3db9536ca2 100644 --- a/block/snapshot.c +++ b/block/snapshot.c @@ -389,7 +389,7 @@ int bdrv_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_info) { GLOBAL_STATE_CODE(); -GRAPH_RDLOCK_GUARD_MAINLOOP(); +GRAPH_RDLOCK_GUARD(); BlockDriver *drv = bs->drv; BlockDriverState *fallback_bs = bdrv_snapshot_fallback(bs); -- 2.35.3
[PATCH v3 02/11] block: Temporarily mark bdrv_co_get_allocated_file_size as mixed
Some callers of this function are about to be converted to run in coroutines, so allow it to be executed both inside and outside a coroutine while we convert all the callers. This will be reverted once all callers of bdrv_do_query_node_info run in a coroutine. Signed-off-by: Fabiano Rosas Reviewed-by: Eric Blake Reviewed-by: Hanna Czenczek --- include/block/block-io.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/block/block-io.h b/include/block/block-io.h index b49e0537dd..349d7760a1 100644 --- a/include/block/block-io.h +++ b/include/block/block-io.h @@ -86,7 +86,7 @@ int64_t co_wrapper_mixed_bdrv_rdlock bdrv_getlength(BlockDriverState *bs); int64_t coroutine_fn GRAPH_RDLOCK bdrv_co_get_allocated_file_size(BlockDriverState *bs); -int64_t co_wrapper_bdrv_rdlock +int64_t co_wrapper_mixed_bdrv_rdlock bdrv_get_allocated_file_size(BlockDriverState *bs); BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts, -- 2.35.3
[PATCH v3 00/11] block: Convert qmp_query_block into a coroutine
Hi, it's been a while since the last version, so a recap: This series converts qmp_query_block() & qmp_query_named_block_nodes() to coroutines so we can yield from them all the way back into the main loop. This addresses a vcpu softlockup encountered when querying a disk placed on NFS. If the NFS server happens to have high latency, an fstat() issued from raw_co_get_allocated_file_size() could take seconds while the whole QMP command is holding the BQL and blocks a vcpu thread going out of the guest to handle IO. This scenario is clearly undesireable since a query command is of much lower priority than the vcpu thread doing actual work. Move the 'fstat' call into the thread-pool and make the necessary adaptations to ensure the whole QMP command that calls raw_co_get_allocated_file_size() runs in a coroutine. Changes since v2: - Do the changes more gradually to make it easier to reason about the safety of the change. - Patch 4 addresses the issue I asked about recently on the ml [1] about how to avoid dispatching the QMP command during an aio_poll(). - Converted qmp_query_block and qmp_query_named_block_nodes in a single patch to avoid having hmp_info_block call a coroutine_fn out of coroutine context. On v2, Hanna asked: "I wonder how the threading is actually supposed to work. I assume QMP coroutines run in the main thread, so now we run bdrv_co_get_allocated_file_size() in the main thread – is that correct, or do we need to use bdrv_co_enter() like qmp_block_resize() does? And so, if we run it in the main thread, is it OK not to acquire the AioContext around it to prevent interference from a potential I/O thread?" The QMP coroutines and also bdrv_co_get_allocated_file_size() run in the main thread. This series doesn't change that. The difference is that instead of bdrv_co_get_allocated_file_size() yielding back to bdrv_poll(), it now yields back to the main loop. As for thread safety, that's basically what I asked about in [1], so I'm still gathering information and don't have a definite answer for it. Since we don't have the AioContext lock anymore, it seems that safety is now dependant on not dispatching the QMP command while other operations are ongoing. Still, for this particular case of fstat(), I don't think interference of an I/O thread could cause any problems, as long as the file descriptor is not closed prematurely. The fstat() manual already mentions that it is succeptible to return old information in some cases. CI run: https://gitlab.com/farosas/qemu/-/pipelines/1244905208 1- Advice on block QMP command coroutines https://lore.kernel.org/r/87bk6trl9i@suse.de Initial discussion: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03141.html v1: https://lore.kernel.org/r/20230523213903.18418-1-faro...@suse.de v2: https://lore.kernel.org/r/20230609201910.12100-1-faro...@suse.de Fabiano Rosas (9): block: Allow the wrapper script to see functions declared in qapi.h block: Temporarily mark bdrv_co_get_allocated_file_size as mixed block: Take the graph lock in bdrv_snapshot_list block: Reschedule query-block during qcow2 invalidation block: Run bdrv_do_query_node_info in a coroutine block: Convert bdrv_query_block_graph_info to coroutine block: Convert bdrv_query_image_info to coroutine block: Convert bdrv_block_device_info into co_wrapper block: Don't query all block devices at hmp_nbd_server_start João Silva (1): block: Add a thread-pool version of fstat Lin Ma (1): block: Convert qmp_query_block and qmp_query_named_block_nodes to coroutine block.c| 9 +++-- block/file-posix.c | 40 +-- block/meson.build | 1 + block/mirror.c | 1 + block/monitor/block-hmp-cmds.c | 34 +++- block/qapi.c | 63 +++--- block/qcow2.c | 20 ++ block/replication.c| 1 + block/snapshot.c | 2 +- blockdev.c | 8 ++-- blockjob.c | 1 + hmp-commands-info.hx | 1 + include/block/block-common.h | 1 + include/block/block-global-state.h | 3 +- include/block/block-hmp-cmds.h | 2 +- include/block/qapi.h | 24 include/block/raw-aio.h| 4 +- migration/block.c | 1 + qapi/block-core.json | 5 ++- qemu-img.c | 3 -- scripts/block-coroutine-wrapper.py | 1 + 21 files changed, 157 insertions(+), 68 deletions(-) -- 2.35.3
[PATCH v3 01/11] block: Allow the wrapper script to see functions declared in qapi.h
The following patches will add co_wrapper annotations to functions declared in qapi.h. Add that header to the set of files used by block-coroutine-wrapper.py. Reviewed-by: Hanna Czenczek Signed-off-by: Fabiano Rosas --- block/meson.build | 1 + scripts/block-coroutine-wrapper.py | 1 + 2 files changed, 2 insertions(+) diff --git a/block/meson.build b/block/meson.build index e1f03fd773..8fe684d301 100644 --- a/block/meson.build +++ b/block/meson.build @@ -154,6 +154,7 @@ block_gen_c = custom_target('block-gen.c', '../include/block/dirty-bitmap.h', '../include/block/block_int-io.h', '../include/block/block-global-state.h', + '../include/block/qapi.h', '../include/sysemu/block-backend-global-state.h', '../include/sysemu/block-backend-io.h', 'coroutines.h' diff --git a/scripts/block-coroutine-wrapper.py b/scripts/block-coroutine-wrapper.py index dbbde99e39..067b203801 100644 --- a/scripts/block-coroutine-wrapper.py +++ b/scripts/block-coroutine-wrapper.py @@ -44,6 +44,7 @@ def gen_header(): #include "block/block-gen.h" #include "block/block_int.h" #include "block/dirty-bitmap.h" +#include "block/qapi.h" """ -- 2.35.3
Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
Philippe Mathieu-Daudé writes: > The whole RDMA subsystem was deprecated in commit e9a54265f5 > ("hw/rdma: Deprecate the pvrdma device and the rdma subsystem") > released in v8.2. > > Remove: > - RDMA handling from migration > - dependencies on libibumad, libibverbs and librdmacm > > Keep the RAM_SAVE_FLAG_HOOK definition since it might appears > in old migration streams. > > Cc: Peter Xu > Cc: Li Zhijian > Acked-by: Fabiano Rosas > Signed-off-by: Philippe Mathieu-Daudé Just to be clear, because people raised the point in the last version, the first link in the deprecation commit links to a thread comprising entirely of rdma migration patches. I don't see any ambiguity on whether the deprecation was intended to include migration. There's even an ack from Juan. So on the basis of not reverting the previous maintainer's decision, my Ack stands here. We also had pretty obvious bugs ([1], [2]) in the past that would have been caught if we had any kind of testing for the feature, so I can't even say this thing works currently. @Peter Xu, @Li Zhijian, what are your thoughts on this? 1- https://lore.kernel.org/r/20230920090412.726725-1-lizhij...@fujitsu.com 2- https://lore.kernel.org/r/cahecvy7hxswn4ow_kog+q+tn6f_kmeichevz1qgm-fbxbpp...@mail.gmail.com
Re: [PATCH-for-9.1] rdma: Remove RDMA subsystem and pvrdma device
Philippe Mathieu-Daudé writes: > The whole RDMA subsystem was deprecated in commit e9a54265f5 > ("hw/rdma: Deprecate the pvrdma device and the rdma subsystem") > released in v8.2. Time to remove it. > > Keep the RAM_SAVE_FLAG_HOOK definition since it might appears > in old migration streams. > > Remove the dependencies on libibumad and libibverbs. > > Remove the generated vmw_pvrdma/ directory from linux-headers. > > Remove RDMA handling from migration. > > Remove RDMA handling in GlusterFS block driver. > > Remove rdmacm-mux tool from contrib/. > > Remove PVRDMA device. > > Cc: Peter Xu > Cc: Li Zhijian > Cc: Yuval Shaia > Cc: Marcel Apfelbaum > Signed-off-by: Philippe Mathieu-Daudé Acked-by: Fabiano Rosas
Re: [PATCH 1/6] system/cpus: rename qemu_mutex_lock_iothread() to qemu_bql_lock()
Stefan Hajnoczi writes: > The Big QEMU Lock (BQL) has many names and they are confusing. The > actual QemuMutex variable is called qemu_global_mutex but it's commonly > referred to as the BQL in discussions and some code comments. The > locking APIs, however, are called qemu_mutex_lock_iothread() and > qemu_mutex_unlock_iothread(). > > The "iothread" name is historic and comes from when the main thread was > split into into KVM vcpu threads and the "iothread" (now called the main > loop thread). I have contributed to the confusion myself by introducing > a separate --object iothread, a separate concept unrelated to the BQL. > > The "iothread" name is no longer appropriate for the BQL. Rename the > locking APIs to: > - void qemu_bql_lock(void) > - void qemu_bql_unlock(void) > - bool qemu_bql_locked(void) > > There are more APIs with "iothread" in their names. Subsequent patches > will rename them. There are also comments and documentation that will be > updated in later patches. > > Signed-off-by: Stefan Hajnoczi Acked-by: Fabiano Rosas
Re: [PATCH v2 09/10] block: Convert qmp_query_block() to coroutine_fn
Hanna Czenczek writes: > On 09.06.23 22:19, Fabiano Rosas wrote: >> This is another caller of bdrv_get_allocated_file_size() that needs to >> be converted to a coroutine because that function will be made >> asynchronous when called (indirectly) from the QMP dispatcher. >> >> This QMP command is a candidate because it calls bdrv_do_query_node_info(), >> which in turn calls bdrv_get_allocated_file_size(). >> >> We've determined bdrv_do_query_node_info() to be coroutine-safe (see >> previous commits), so we can just put this QMP command in a coroutine. >> >> Since qmp_query_block() now expects to run in a coroutine, its callers >> need to be converted as well. Convert hmp_info_block(), which calls >> only coroutine-safe code, including qmp_query_named_block_nodes() >> which has been converted to coroutine in the previous patches. >> >> Now that all callers of bdrv_[co_]block_device_info() are using the >> coroutine version, a few things happen: >> >> - we can return to using bdrv_block_device_info() without a wrapper; >> >> - bdrv_get_allocated_file_size() can stop being mixed; >> >> - bdrv_co_get_allocated_file_size() needs to be put under the graph >> lock because it is being called wthout the wrapper; >> >> - bdrv_do_query_node_info() doesn't need to acquire the AioContext >> because it doesn't call aio_poll anymore; >> >> Signed-off-by: Fabiano Rosas >> --- >> block.c| 2 +- >> block/monitor/block-hmp-cmds.c | 2 +- >> block/qapi.c | 18 +- >> hmp-commands-info.hx | 1 + >> include/block/block-hmp-cmds.h | 2 +- >> include/block/block-io.h | 2 +- >> include/block/qapi.h | 12 >> qapi/block-core.json | 2 +- >> 8 files changed, 19 insertions(+), 22 deletions(-) > > After this series has been sent, we got some usages of > GRAPH_RDLOCK_GUARD_MAINLOOP() that no longer fit with this patch – I’ve > also mentioned one case on patch 7, not yet realizing that this was a > new thing. Those must now be fixed (e.g. in qmp_query_block(), or in > bdrv_snapshot_list()), or they’ll crash. Hi, thanks for the reviews! Yes, there's been a lot of changes since this series was sent. I'll rebase it and re-evaluate. Stefan just sent an AioContext series which will probably help bring the complexity of down for this series. Let's see how it goes.
Re: [PATCH v2 01/12] qemu-file: Don't increment qemu_file_transferred at qemu_file_fill_buffer
Juan Quintela writes: > We only call qemu_file_transferred_* on the sending side. Remove the > increment at qemu_file_fill_buffer() and add asserts to > qemu_file_transferred* functions. > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH 12/12] qemu-file: Make qemu_fflush() return errors
Juan Quintela writes: > This let us simplify code of this shape. > >qemu_fflush(f); >int ret = qemu_file_get_error(f); >if (ret) { > return ret; >} > > into: > >int ret = qemu_fflush(f); >if (ret) { > return ret; >} > > I updated all callers where there is any error check. > qemu_fclose() don't need to check for f->last_error because > qemu_fflush() returns it at the beggining of the function. > > Reviewed-by: Philippe Mathieu-Daudé > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH 11/12] migration: Remove transferred atomic counter
Juan Quintela writes: > After last commit, it is a write only variable. > > Signed-off-by: Juan Quintela > --- > migration/migration-stats.h | 4 > migration/multifd.c | 3 --- > migration/ram.c | 1 - > 3 files changed, 8 deletions(-) > > diff --git a/migration/migration-stats.h b/migration/migration-stats.h > index 68f3939188..05290ade76 100644 > --- a/migration/migration-stats.h > +++ b/migration/migration-stats.h > @@ -97,10 +97,6 @@ typedef struct { > * Number of bytes sent through RDMA. > */ > Stat64 rdma_bytes; > -/* > - * Total number of bytes transferred. > - */ > -Stat64 transferred; > /* > * Number of pages transferred that were full of zeros. > */ > diff --git a/migration/multifd.c b/migration/multifd.c > index e2a45c667a..ec58c58082 100644 > --- a/migration/multifd.c > +++ b/migration/multifd.c > @@ -188,7 +188,6 @@ static int multifd_send_initial_packet(MultiFDSendParams > *p, Error **errp) > return -1; > } > stat64_add(_stats.multifd_bytes, size); > -stat64_add(_stats.transferred, size); > return 0; > } > > @@ -733,8 +732,6 @@ static void *multifd_send_thread(void *opaque) > > stat64_add(_stats.multifd_bytes, > p->next_packet_size + p->packet_len); > -stat64_add(_stats.transferred, > - p->next_packet_size + p->packet_len); > p->next_packet_size = 0; > qemu_mutex_lock(>mutex); > p->pending_job--; > diff --git a/migration/ram.c b/migration/ram.c > index 5ccf70333a..6d2bf50614 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -455,7 +455,6 @@ void ram_transferred_add(uint64_t bytes) > } else { > stat64_add(_stats.downtime_bytes, bytes); > } > -stat64_add(_stats.transferred, bytes); > } > > struct MigrationOps { Reviewed-by: Fabiano Rosas
Re: [PATCH 10/12] migration: Use migration_transferred_bytes()
Juan Quintela writes: > There are only two differnces with the old value: > > - the amount of QEMUFile that hasn't yet been flushed. It can be > discussed what is more exact, the new or the old one. > - the amount of transferred bytes that we forgot to account for (the > newer is better, i.e. exact). > > Notice that this two values are used to: > a - present to the user > b - calculate the rate_limit > > So a few KB here and there is not going to make a difference. > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH 06/12] qemu-file: Remove _noflush from qemu_file_transferred_noflush()
Juan Quintela writes: > qemu_file_transferred() don't exist anymore, so we can reuse the name. > > Signed-off-by: Juan Quintela > > --- > > v2: Update the documentation (thanks fabiano) > --- > migration/qemu-file.h | 9 - > migration/block.c | 4 ++-- > migration/qemu-file.c | 2 +- > migration/savevm.c| 6 +++--- > migration/vmstate.c | 4 ++-- > 5 files changed, 12 insertions(+), 13 deletions(-) > > diff --git a/migration/qemu-file.h b/migration/qemu-file.h > index 8b71152754..1b2f6b8d8f 100644 > --- a/migration/qemu-file.h > +++ b/migration/qemu-file.h > @@ -34,15 +34,14 @@ QEMUFile *qemu_file_new_output(QIOChannel *ioc); > int qemu_fclose(QEMUFile *f); > > /* > - * qemu_file_transferred_noflush: > + * qemu_file_transferred: > * > - * As qemu_file_transferred except for writable files, where no flush > - * is performed and the reported amount will include the size of any > - * queued buffers, on top of the amount actually transferred. > + * No flush is performed and the reported amount will include the size > + * of any queued buffers, on top of the amount actually transferred. > * > * Returns: the total bytes transferred and queued > */ > -uint64_t qemu_file_transferred_noflush(QEMUFile *f); > +uint64_t qemu_file_transferred(QEMUFile *f); > > /* > * put_buffer without copying the buffer. > diff --git a/migration/block.c b/migration/block.c > index b60698d6e2..47f11d0e4f 100644 > --- a/migration/block.c > +++ b/migration/block.c > @@ -752,7 +752,7 @@ static int block_save_setup(QEMUFile *f, void *opaque) > static int block_save_iterate(QEMUFile *f, void *opaque) > { > int ret; > -uint64_t last_bytes = qemu_file_transferred_noflush(f); > +uint64_t last_bytes = qemu_file_transferred(f); > > trace_migration_block_save("iterate", block_mig_state.submitted, > block_mig_state.transferred); > @@ -804,7 +804,7 @@ static int block_save_iterate(QEMUFile *f, void *opaque) > } > > qemu_put_be64(f, BLK_MIG_FLAG_EOS); > -uint64_t delta_bytes = qemu_file_transferred_noflush(f) - last_bytes; > +uint64_t delta_bytes = qemu_file_transferred(f) - last_bytes; > return (delta_bytes > 0); > } > > diff --git a/migration/qemu-file.c b/migration/qemu-file.c > index efa5f11033..0158db2a54 100644 > --- a/migration/qemu-file.c > +++ b/migration/qemu-file.c > @@ -618,7 +618,7 @@ int coroutine_mixed_fn qemu_get_byte(QEMUFile *f) > return result; > } > > -uint64_t qemu_file_transferred_noflush(QEMUFile *f) > +uint64_t qemu_file_transferred(QEMUFile *f) > { > uint64_t ret = stat64_get(_stats.qemu_file_transferred); > int i; > diff --git a/migration/savevm.c b/migration/savevm.c > index 8622f229e5..9c90499609 100644 > --- a/migration/savevm.c > +++ b/migration/savevm.c > @@ -927,9 +927,9 @@ static int vmstate_load(QEMUFile *f, SaveStateEntry *se) > static void vmstate_save_old_style(QEMUFile *f, SaveStateEntry *se, > JSONWriter *vmdesc) > { > -uint64_t old_offset = qemu_file_transferred_noflush(f); > +uint64_t old_offset = qemu_file_transferred(f); > se->ops->save_state(f, se->opaque); > -uint64_t size = qemu_file_transferred_noflush(f) - old_offset; > +uint64_t size = qemu_file_transferred(f) - old_offset; > > if (vmdesc) { > json_writer_int64(vmdesc, "size", size); > @@ -3053,7 +3053,7 @@ bool save_snapshot(const char *name, bool overwrite, > const char *vmstate, > goto the_end; > } > ret = qemu_savevm_state(f, errp); > -vm_state_size = qemu_file_transferred_noflush(f); > +vm_state_size = qemu_file_transferred(f); > ret2 = qemu_fclose(f); > if (ret < 0) { > goto the_end; > diff --git a/migration/vmstate.c b/migration/vmstate.c > index 1cf9e45b85..16420fa9a3 100644 > --- a/migration/vmstate.c > +++ b/migration/vmstate.c > @@ -386,7 +386,7 @@ int vmstate_save_state_v(QEMUFile *f, const > VMStateDescription *vmsd, > void *curr_elem = first_elem + size * i; > > vmsd_desc_field_start(vmsd, vmdesc_loop, field, i, n_elems); > -old_offset = qemu_file_transferred_noflush(f); > +old_offset = qemu_file_transferred(f); > if (field->flags & VMS_ARRAY_OF_POINTER) { > assert(curr_elem); > curr_elem = *(void **)curr_elem; > @@ -416,7 +416,7 @@ int vmstate_save_state_v(QEMUFile *f, const > VMStateDescription *vmsd, > return ret; > } > > -written_bytes = qemu_file_transferred_noflush(f) - > old_offset; > +written_bytes = qemu_file_transferred(f) - old_offset; > vmsd_desc_field_end(vmsd, vmdesc_loop, field, written_bytes, > i); > > /* Compressed arrays only care about the first element */ Reviewed-by: Fabiano Rosas
Re: [PATCH 05/12] qemu_file: Remove unused qemu_file_transferred()
Juan Quintela writes: > Signed-off-by: Juan Quintela > --- > migration/qemu-file.h | 18 -- > migration/qemu-file.c | 7 --- > 2 files changed, 25 deletions(-) > > diff --git a/migration/qemu-file.h b/migration/qemu-file.h > index a29c37b0d0..8b71152754 100644 > --- a/migration/qemu-file.h > +++ b/migration/qemu-file.h > @@ -33,24 +33,6 @@ QEMUFile *qemu_file_new_input(QIOChannel *ioc); > QEMUFile *qemu_file_new_output(QIOChannel *ioc); > int qemu_fclose(QEMUFile *f); > > -/* > - * qemu_file_transferred: > - * > - * Report the total number of bytes transferred with > - * this file. > - * > - * For writable files, any pending buffers will be > - * flushed, so the reported value will be equal to > - * the number of bytes transferred on the wire. > - * > - * For readable files, the reported value will be > - * equal to the number of bytes transferred on the > - * wire. > - * > - * Returns: the total bytes transferred > - */ > -uint64_t qemu_file_transferred(QEMUFile *f); > - > /* > * qemu_file_transferred_noflush: > * > diff --git a/migration/qemu-file.c b/migration/qemu-file.c > index 641ab703cc..efa5f11033 100644 > --- a/migration/qemu-file.c > +++ b/migration/qemu-file.c > @@ -632,13 +632,6 @@ uint64_t qemu_file_transferred_noflush(QEMUFile *f) > return ret; > } > > -uint64_t qemu_file_transferred(QEMUFile *f) > -{ > -g_assert(qemu_file_is_writable(f)); > -qemu_fflush(f); > -return stat64_get(_stats.qemu_file_transferred); > -} > - > void qemu_put_be16(QEMUFile *f, unsigned int v) > { > qemu_put_byte(f, v >> 8); Reviewed-by: Fabiano Rosas
Re: [PATCH 04/12] migration: Use the number of transferred bytes directly
Juan Quintela writes: > We only use migration_transferred_bytes() to calculate the rate_limit, > for that we don't need to flush whatever is on the qemu_file buffer. > Remember that the buffer is really small (normal case is 32K if we use > iov's can be 64 * TARGET_PAGE_SIZE), so this is not relevant to > calculations. > > Signed-off-by: Juan Quintela > --- > migration/migration-stats.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/migration/migration-stats.c b/migration/migration-stats.c > index 4cc989d975..1d9197b4c3 100644 > --- a/migration/migration-stats.c > +++ b/migration/migration-stats.c > @@ -63,7 +63,7 @@ uint64_t migration_transferred_bytes(QEMUFile *f) > { > uint64_t multifd = stat64_get(_stats.multifd_bytes); > uint64_t rdma = stat64_get(_stats.rdma_bytes); > -uint64_t qemu_file = qemu_file_transferred(f); > +uint64_t qemu_file = stat64_get(_stats.qemu_file_transferred); > > trace_migration_transferred_bytes(qemu_file, multifd, rdma); > return qemu_file + multifd + rdma; Reviewed-by: Fabiano Rosas
Re: [PATCH 03/12] qemu_file: total_transferred is not used anymore
Juan Quintela writes: > Signed-off-by: Juan Quintela > --- > migration/qemu-file.c | 4 > 1 file changed, 4 deletions(-) > > diff --git a/migration/qemu-file.c b/migration/qemu-file.c > index 384985f534..641ab703cc 100644 > --- a/migration/qemu-file.c > +++ b/migration/qemu-file.c > @@ -41,9 +41,6 @@ struct QEMUFile { > QIOChannel *ioc; > bool is_writable; > > -/* The sum of bytes transferred on the wire */ > -uint64_t total_transferred; > - > int buf_index; > int buf_size; /* 0 when writing */ > uint8_t buf[IO_BUF_SIZE]; > @@ -282,7 +279,6 @@ void qemu_fflush(QEMUFile *f) > qemu_file_set_error_obj(f, -EIO, local_error); > } else { > uint64_t size = iov_size(f->iov, f->iovcnt); > -f->total_transferred += size; > stat64_add(_stats.qemu_file_transferred, size); > } Reviewed-by: Fabiano Rosas
Re: [PATCH 02/12] qemu_file: Use a stat64 for qemu_file_transferred
Juan Quintela writes: > This way we can read it from any thread. > I checked that it gives the same value than the current one. We never > use to qemu_files at the same time. > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH 01/12] qemu-file: We only call qemu_file_transferred_* on the sending side
Juan Quintela writes: > Remove the increase in qemu_file_fill_buffer() and add asserts to > qemu_file_transferred* functions. Patch looks ok, but I would rewrite the whole commit message like this: Don't increment qemu_file_transferred at qemu_file_fill_buffer We only call qemu_file_transferred_* on the sending side. Remove the increment at qemu_file_fill_buffer() and add asserts to qemu_file_transferred* functions.
Re: [PATCH v5 1/7] migration: Print block status when needed
Juan Quintela writes: > The new line was only printed when command options were used. When we > used migration parameters and capabilities, it wasn't. > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PULL 11/38] tests/qtest: migration-test: Add tests for file-based migration
Juan Quintela writes: > Fabiano Rosas wrote: >> Juan Quintela writes: >> >>> Fabiano Rosas wrote: >>> D> Juan Quintela writes: >>>> >>>>> From: Fabiano Rosas >>>>> >>>>> Add basic tests for file-based migration. >>>>> >>>>> Note that we cannot use test_precopy_common because that routine >>>>> expects it to be possible to run the migration live. With the file >>>>> transport there is no live migration because we must wait for the >>>>> source to finish writing the migration data to the file before the >>>>> destination can start reading. Add a new migration function >>>>> specifically to handle the file migration. >>>>> >>>>> Reviewed-by: Peter Xu >>>>> Reviewed-by: Juan Quintela >>>>> Signed-off-by: Fabiano Rosas >>>>> Signed-off-by: Juan Quintela >>>>> Message-ID: <20230712190742.22294-7-faro...@suse.de> >>> >>>>> +static void file_offset_finish_hook(QTestState *from, QTestState *to, >>>>> +void *opaque) >>>>> +{ >>>>> +#if defined(__linux__) >>>>> +g_autofree char *path = g_strdup_printf("%s/%s", tmpfs, >>>>> FILE_TEST_FILENAME); >>>>> +size_t size = FILE_TEST_OFFSET + sizeof(QEMU_VM_FILE_MAGIC); >>>>> +uintptr_t *addr, *p; >>>>> +int fd; >>>>> + >>>>> +fd = open(path, O_RDONLY); >>>>> +g_assert(fd != -1); >>>>> +addr = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0); >>>>> +g_assert(addr != MAP_FAILED); >>>>> + >>>>> +/* >>>>> + * Ensure the skipped offset contains zeros and the migration >>>>> + * stream starts at the right place. >>>>> + */ >>>>> +p = addr; >>>>> +while (p < addr + FILE_TEST_OFFSET / sizeof(uintptr_t)) { >>>>> +g_assert(*p == 0); >>>>> +p++; >>>>> +} >>>>> +g_assert_cmpint(cpu_to_be32(*p), ==, QEMU_VM_FILE_MAGIC); >>>> >>>> This truncates to 32-bits, so it breaks on a BE host. We need this: >>>> >>>> -->8-- >>>> From ea0c2d1c988add48d9754891a9fc7f6854a9718a Mon Sep 17 00:00:00 2001 >>>> From: Fabiano Rosas >>>> Date: Mon, 16 Oct 2023 15:21:49 -0300 >>>> Subject: [PATCH] fixup! tests/qtest: migration-test: Add tests for >>>> file-based >>>> migration >>>> >>>> --- >>>> tests/qtest/migration-test.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c >>>> index da02b6d692..e1c110537b 100644 >>>> --- a/tests/qtest/migration-test.c >>>> +++ b/tests/qtest/migration-test.c >>>> @@ -1966,7 +1966,7 @@ static void file_offset_finish_hook(QTestState >>>> *from, QTestState *to, >>>> g_assert(*p == 0); >>>> p++; >>>> } >>>> -g_assert_cmpint(cpu_to_be32(*p), ==, QEMU_VM_FILE_MAGIC); >>>> +g_assert_cmpint(cpu_to_be64(*p) >> 32, ==, QEMU_VM_FILE_MAGIC); >>>> >>>> munmap(addr, size); >>>> close(fd); >>> >>> I am resubmitting with this change. >>> >>> But I think we need to change this: >>> >>>>> +g_autofree char *path = g_strdup_printf("%s/%s", tmpfs, >>>>> FILE_TEST_FILENAME); >>>>> +size_t size = FILE_TEST_OFFSET + sizeof(QEMU_VM_FILE_MAGIC); >>>>> +uintptr_t *addr, *p; >>> >>> I think we should change the test so the file is 64 bits on every >>> architecture. >>> Then we can cast to void * or uintptr_t as needed. >> >> Hm, I don't get what you mean here. What needs to be 64 bits? > > size_t is 32 bits on 32bits host, and 64 bits on 64 bits host. > uintprt_t is the same. Right, I have thought of that when writing this fix yesterday, but I dismissed it because I thought we were never have a 32bit host running these tests. > So using explicit sizes: > > static void file_offset_finish_hook(QTestState *from, QTestState *to, > void *opaque) > { > #if defined(__linux__) > g_autofree char *path = g_strdup_printf("%s/%s", tmpfs, > FILE_TEST_FILENAME); > size_t size = FILE_TEST_OFFSET + sizeof(QEMU_VM_FILE_MAGIC); > uint64_t *addr, *p; > int fd; > > fd = open(path, O_RDONLY); > g_assert(fd != -1); > addr = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0); > g_assert(addr != MAP_FAILED); > > /* > * Ensure the skipped offset contains zeros and the migration > * stream starts at the right place. > */ > p = addr; > while (p < (uintprt_t)addr + FILE_TEST_OFFSET) { > g_assert(*p == 0); > p++; > } > g_assert_cmpint(cpu_to_be64(*p) >> 32, ==, QEMU_VM_FILE_MAGIC); > > munmap(addr, size); > close(fd); > #endif > } > > This is completely untested, but it should make sure that we are reading > 64bits integers in both 32 and 64 bits hosts, no? Looks like it. I can give it a try and send an update as a separate patch.
Re: [PULL 11/38] tests/qtest: migration-test: Add tests for file-based migration
Juan Quintela writes: > Fabiano Rosas wrote: > D> Juan Quintela writes: >> >>> From: Fabiano Rosas >>> >>> Add basic tests for file-based migration. >>> >>> Note that we cannot use test_precopy_common because that routine >>> expects it to be possible to run the migration live. With the file >>> transport there is no live migration because we must wait for the >>> source to finish writing the migration data to the file before the >>> destination can start reading. Add a new migration function >>> specifically to handle the file migration. >>> >>> Reviewed-by: Peter Xu >>> Reviewed-by: Juan Quintela >>> Signed-off-by: Fabiano Rosas >>> Signed-off-by: Juan Quintela >>> Message-ID: <20230712190742.22294-7-faro...@suse.de> > >>> +static void file_offset_finish_hook(QTestState *from, QTestState *to, >>> +void *opaque) >>> +{ >>> +#if defined(__linux__) >>> +g_autofree char *path = g_strdup_printf("%s/%s", tmpfs, >>> FILE_TEST_FILENAME); >>> +size_t size = FILE_TEST_OFFSET + sizeof(QEMU_VM_FILE_MAGIC); >>> +uintptr_t *addr, *p; >>> +int fd; >>> + >>> +fd = open(path, O_RDONLY); >>> +g_assert(fd != -1); >>> +addr = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0); >>> +g_assert(addr != MAP_FAILED); >>> + >>> +/* >>> + * Ensure the skipped offset contains zeros and the migration >>> + * stream starts at the right place. >>> + */ >>> +p = addr; >>> + while (p < addr + FILE_TEST_OFFSET / sizeof(uintptr_t)) { >>> +g_assert(*p == 0); >>> +p++; >>> +} >>> +g_assert_cmpint(cpu_to_be32(*p), ==, QEMU_VM_FILE_MAGIC); >> >> This truncates to 32-bits, so it breaks on a BE host. We need this: >> >> -->8-- >> From ea0c2d1c988add48d9754891a9fc7f6854a9718a Mon Sep 17 00:00:00 2001 >> From: Fabiano Rosas >> Date: Mon, 16 Oct 2023 15:21:49 -0300 >> Subject: [PATCH] fixup! tests/qtest: migration-test: Add tests for file-based >> migration >> >> --- >> tests/qtest/migration-test.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c >> index da02b6d692..e1c110537b 100644 >> --- a/tests/qtest/migration-test.c >> +++ b/tests/qtest/migration-test.c >> @@ -1966,7 +1966,7 @@ static void file_offset_finish_hook(QTestState *from, >> QTestState *to, >> g_assert(*p == 0); >> p++; >> } >> -g_assert_cmpint(cpu_to_be32(*p), ==, QEMU_VM_FILE_MAGIC); >> +g_assert_cmpint(cpu_to_be64(*p) >> 32, ==, QEMU_VM_FILE_MAGIC); >> >> munmap(addr, size); >> close(fd); > > I am resubmitting with this change. > > But I think we need to change this: > >>> +g_autofree char *path = g_strdup_printf("%s/%s", tmpfs, >>> FILE_TEST_FILENAME); >>> +size_t size = FILE_TEST_OFFSET + sizeof(QEMU_VM_FILE_MAGIC); >>> +uintptr_t *addr, *p; > > I think we should change the test so the file is 64 bits on every > architecture. > Then we can cast to void * or uintptr_t as needed. Hm, I don't get what you mean here. What needs to be 64 bits?
Re: [PULL 00/38] Migration 20231016 patches
Stefan Hajnoczi writes: > On Mon, 16 Oct 2023 at 13:13, Fabiano Rosas wrote: >> >> Stefan Hajnoczi writes: >> >> > On Mon, 16 Oct 2023 at 06:11, Juan Quintela wrote: >> >> >> >> The following changes since commit >> >> 63011373ad22c794a013da69663c03f1297a5c56: >> >> >> >> Merge tag 'pull-riscv-to-apply-20231012-1' of >> >> https://github.com/alistair23/qemu into staging (2023-10-12 10:24:44 >> >> -0400) >> >> >> >> are available in the Git repository at: >> >> >> >> https://gitlab.com/juan.quintela/qemu.git >> >> tags/migration-20231016-pull-request >> >> >> >> for you to fetch changes up to f39b0f42753635b0f2d8b00a26d11bb197bf51e2: >> >> >> >> migration/multifd: Clarify Error usage in multifd_channel_connect >> >> (2023-10-16 11:01:33 +0200) >> >> >> >> >> >> Migration Pull request (20231016) >> >> >> >> In this pull request: >> >> - rdma cleanups >> >> - removal of QEMUFileHook >> >> - test for analyze-migration.py >> >> - test for multifd file >> >> - multifd cleanups >> >> - available switchover bandwidth >> >> - lots of cleanups. >> >> >> >> CI: https://gitlab.com/juan.quintela/qemu/-/pipelines/1037878829 >> >> >> >> Please, apply. >> > >> > This CI failure looks migration-related: >> > >> > MALLOC_PERTURB_=96 >> > PYTHON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/pyvenv/bin/python3 >> > QTEST_QEMU_BINARY=./qemu-system-i386 >> > G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh >> > QTEST_QEMU_IMG=./qemu-img >> > QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon >> > /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/migration-test >> > --tap -k >> > ― ✀ >> > ― >> > stderr: >> > ** >> > ERROR:../tests/qtest/migration-test.c:1969:file_offset_finish_hook: >> > assertion failed (cpu_to_be32(*p) == QEMU_VM_FILE_MAGIC): (3 == >> > 1363498573) >> >> That's the test for the file: transport which got merged in the last >> PR. I'll look into it. > > I have dropped the 20231016 pull request for now and the tests passed > without it. Maybe there is an interaction with the test you recently > added and this pull request? Sorry, I expressed myself poorly. The test _is_ what is breaking this pull request. The feature was already merged and is working fine. I commented with a fix on the patch that adds the test. [PULL 11/38] tests/qtest: migration-test: Add tests for file-based migration https://lore.kernel.org/r/20231016100706.2551-12-quint...@redhat.com
Re: [PULL 11/38] tests/qtest: migration-test: Add tests for file-based migration
Juan Quintela writes: > From: Fabiano Rosas > > Add basic tests for file-based migration. > > Note that we cannot use test_precopy_common because that routine > expects it to be possible to run the migration live. With the file > transport there is no live migration because we must wait for the > source to finish writing the migration data to the file before the > destination can start reading. Add a new migration function > specifically to handle the file migration. > > Reviewed-by: Peter Xu > Reviewed-by: Juan Quintela > Signed-off-by: Fabiano Rosas > Signed-off-by: Juan Quintela > Message-ID: <20230712190742.22294-7-faro...@suse.de> > --- > tests/qtest/migration-test.c | 147 +++ > 1 file changed, 147 insertions(+) > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c > index cef5081f8c..da02b6d692 100644 > --- a/tests/qtest/migration-test.c > +++ b/tests/qtest/migration-test.c > @@ -68,6 +68,10 @@ static bool got_dst_resume; > > #define ANALYZE_SCRIPT "scripts/analyze-migration.py" > > +#define QEMU_VM_FILE_MAGIC 0x5145564d > +#define FILE_TEST_FILENAME "migfile" > +#define FILE_TEST_OFFSET 0x1000 > + > #if defined(__linux__) > #include > #include > @@ -884,6 +888,7 @@ static void test_migrate_end(QTestState *from, QTestState > *to, bool test_dest) > cleanup("migsocket"); > cleanup("src_serial"); > cleanup("dest_serial"); > +cleanup(FILE_TEST_FILENAME); > } > > #ifdef CONFIG_GNUTLS > @@ -1667,6 +1672,70 @@ finish: > test_migrate_end(from, to, args->result == MIG_TEST_SUCCEED); > } > > +static void test_file_common(MigrateCommon *args, bool stop_src) > +{ > +QTestState *from, *to; > +void *data_hook = NULL; > +g_autofree char *connect_uri = g_strdup(args->connect_uri); > + > +if (test_migrate_start(, , args->listen_uri, >start)) { > +return; > +} > + > +/* > + * File migration is never live. We can keep the source VM running > + * during migration, but the destination will not be running > + * concurrently. > + */ > +g_assert_false(args->live); > + > +if (args->start_hook) { > +data_hook = args->start_hook(from, to); > +} > + > +migrate_ensure_converge(from); > +wait_for_serial("src_serial"); > + > +if (stop_src) { > +qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}"); > +if (!got_src_stop) { > +qtest_qmp_eventwait(from, "STOP"); > +} > +} > + > +if (args->result == MIG_TEST_QMP_ERROR) { > +migrate_qmp_fail(from, connect_uri, "{}"); > +goto finish; > +} > + > +migrate_qmp(from, connect_uri, "{}"); > +wait_for_migration_complete(from); > + > +/* > + * We need to wait for the source to finish before starting the > + * destination. > + */ > +migrate_incoming_qmp(to, connect_uri, "{}"); > +wait_for_migration_complete(to); > + > +if (stop_src) { > +qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}"); > +} > + > +if (!got_dst_resume) { > +qtest_qmp_eventwait(to, "RESUME"); > +} > + > +wait_for_serial("dest_serial"); > + > +finish: > +if (args->finish_hook) { > +args->finish_hook(from, to, data_hook); > +} > + > +test_migrate_end(from, to, args->result == MIG_TEST_SUCCEED); > +} > + > static void test_precopy_unix_plain(void) > { > g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); > @@ -1862,6 +1931,76 @@ static void test_precopy_unix_compress_nowait(void) > test_precopy_common(); > } > > +static void test_precopy_file(void) > +{ > +g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs, > + FILE_TEST_FILENAME); > +MigrateCommon args = { > +.connect_uri = uri, > +.listen_uri = "defer", > +}; > + > +test_file_common(, true); > +} > + > +static void file_offset_finish_hook(QTestState *from, QTestState *to, > +void *opaque) > +{ > +#if defined(__linux__) > +g_autofree char *path = g_strdup_printf("%s/%s", tmpfs, > FILE_TEST_FILENAME); > +size_t size = FILE_TEST_OFFSET + sizeof(QEMU_VM_FILE_MAGIC); > +uintptr_t *addr, *p; > +int fd; > + > +fd = open(path, O_RDONLY); > +
Re: [PULL 00/38] Migration 20231016 patches
Stefan Hajnoczi writes: > On Mon, 16 Oct 2023 at 06:11, Juan Quintela wrote: >> >> The following changes since commit 63011373ad22c794a013da69663c03f1297a5c56: >> >> Merge tag 'pull-riscv-to-apply-20231012-1' of >> https://github.com/alistair23/qemu into staging (2023-10-12 10:24:44 -0400) >> >> are available in the Git repository at: >> >> https://gitlab.com/juan.quintela/qemu.git >> tags/migration-20231016-pull-request >> >> for you to fetch changes up to f39b0f42753635b0f2d8b00a26d11bb197bf51e2: >> >> migration/multifd: Clarify Error usage in multifd_channel_connect >> (2023-10-16 11:01:33 +0200) >> >> >> Migration Pull request (20231016) >> >> In this pull request: >> - rdma cleanups >> - removal of QEMUFileHook >> - test for analyze-migration.py >> - test for multifd file >> - multifd cleanups >> - available switchover bandwidth >> - lots of cleanups. >> >> CI: https://gitlab.com/juan.quintela/qemu/-/pipelines/1037878829 >> >> Please, apply. > > This CI failure looks migration-related: > > MALLOC_PERTURB_=96 > PYTHON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/pyvenv/bin/python3 > QTEST_QEMU_BINARY=./qemu-system-i386 > G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh > QTEST_QEMU_IMG=./qemu-img > QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon > /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/migration-test > --tap -k > ― ✀ ― > stderr: > ** > ERROR:../tests/qtest/migration-test.c:1969:file_offset_finish_hook: > assertion failed (cpu_to_be32(*p) == QEMU_VM_FILE_MAGIC): (3 == > 1363498573) That's the test for the file: transport which got merged in the last PR. I'll look into it.
Re: [PATCH v4] migration: hold the BQL during setup
Fiona Ebner writes: > This is intended to be a semantic revert of commit 9b09503752 > ("migration: run setup callbacks out of big lock"). There have been so > many changes since that commit (e.g. a new setup callback > dirty_bitmap_save_setup() that also needs to be adapted now), it's > easier to do the revert manually. > > For snapshots, the bdrv_writev_vmstate() function is used during setup > (in QIOChannelBlock backing the QEMUFile), but not holding the BQL > while calling it could lead to an assertion failure. To understand > how, first note the following: Would it make sense to add a GLOBAL_STATE_CODE() annotation to qio_channel_block_writev? > 1. Generated coroutine wrappers for block layer functions spawn the > coroutine and use AIO_WAIT_WHILE()/aio_poll() to wait for it. > 2. If the host OS switches threads at an inconvenient time, it can > happen that a bottom half scheduled for the main thread's AioContext > is executed as part of a vCPU thread's aio_poll(). > > An example leading to the assertion failure is as follows: > > main thread: > 1. A snapshot-save QMP command gets issued. > 2. snapshot_save_job_bh() is scheduled. > > vCPU thread: > 3. aio_poll() for the main thread's AioContext is called (e.g. when > the guest writes to a pflash device, as part of blk_pwrite which is a > generated coroutine wrapper). > 4. snapshot_save_job_bh() is executed as part of aio_poll(). > 3. qemu_savevm_state() is called. > 4. qemu_mutex_unlock_iothread() is called. Now > qemu_get_current_aio_context() returns 0x0. > 5. bdrv_writev_vmstate() is executed during the usual savevm setup > via qemu_fflush(). But this function is a generated coroutine wrapper, > so it uses AIO_WAIT_WHILE. There, the assertion > assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > will fail. > > To fix it, ensure that the BQL is held during setup. While it would > only be needed for snapshots, adapting migration too avoids additional > logic for conditional locking/unlocking in the setup callbacks. > Writing the header could (in theory) also trigger qemu_fflush() and > thus bdrv_writev_vmstate(), so the locked section also covers the > qemu_savevm_state_header() call, even for migration for consistentcy. > > The section around multifd_send_sync_main() needs to be unlocked to > avoid a deadlock. In particular, the function calls ... the multifd_save_setup() function calls ... otherwise this paragraph makes no sense. > socket_send_channel_create() using multifd_new_send_channel_async() as > a callback and then waits for the callback to signal via the > channels_ready semaphore. The connection happens via > qio_task_run_in_thread(), but the callback is only executed via > qio_task_thread_result() which is scheduled for the main event loop. > Without unlocking the section, the main thread would never get to > process the task result and the callback meaning there would be no > signal via the channels_ready semaphore. > > The comment in ram_init_bitmaps() was introduced by 4987783400 > ("migration: fix incorrect memory_global_dirty_log_start outside BQL") > and is removed, because it referred to the qemu_mutex_lock_iothread() > call. > > Signed-off-by: Fiona Ebner Thanks for taking the time to explain stuff in the commit message. I dislike having unnecessary dependencies on the BQL throughout the migration code, but I see people preferred that over conditional locking in the previous versions, so in the name of consensus: Reviewed-by: Fabiano Rosas
[PATCH RESEND 0/2] block/qapi: Dead code cleanup
Hi, I'm resending a couple of already reviewed patches that were part of a larger series[1]. Thanks 1- https://lore.kernel.org/r/20230609201910.12100-1-faro...@suse.de Fabiano Rosas (2): block: Remove bdrv_query_block_node_info block: Remove unnecessary variable in bdrv_block_device_info block/qapi.c | 32 ++-- include/block/qapi.h | 3 --- 2 files changed, 2 insertions(+), 33 deletions(-) -- 2.35.3
[PATCH RESEND 1/2] block: Remove bdrv_query_block_node_info
The last call site of this function has been removed by commit c04d0ab026 ("qemu-img: Let info print block graph"). Reviewed-by: Claudio Fontana Signed-off-by: Fabiano Rosas --- block/qapi.c | 27 --- include/block/qapi.h | 3 --- 2 files changed, 30 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index f34f95e0ef..79bf80c503 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -309,33 +309,6 @@ out: aio_context_release(bdrv_get_aio_context(bs)); } -/** - * bdrv_query_block_node_info: - * @bs: block node to examine - * @p_info: location to store node information - * @errp: location to store error information - * - * Store image information about @bs in @p_info. - * - * @p_info will be set only on success. On error, store error in @errp. - */ -void bdrv_query_block_node_info(BlockDriverState *bs, -BlockNodeInfo **p_info, -Error **errp) -{ -BlockNodeInfo *info; -ERRP_GUARD(); - -info = g_new0(BlockNodeInfo, 1); -bdrv_do_query_node_info(bs, info, errp); -if (*errp) { -qapi_free_BlockNodeInfo(info); -return; -} - -*p_info = info; -} - /** * bdrv_query_image_info: * @bs: block node to examine diff --git a/include/block/qapi.h b/include/block/qapi.h index 18d48ddb70..8663971c58 100644 --- a/include/block/qapi.h +++ b/include/block/qapi.h @@ -36,9 +36,6 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, int bdrv_query_snapshot_info_list(BlockDriverState *bs, SnapshotInfoList **p_list, Error **errp); -void bdrv_query_block_node_info(BlockDriverState *bs, -BlockNodeInfo **p_info, -Error **errp); void bdrv_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat, -- 2.35.3
[PATCH RESEND 2/2] block: Remove unnecessary variable in bdrv_block_device_info
The commit 5d8813593f ("block/qapi: Let bdrv_query_image_info() recurse") removed the loop where we set the 'bs0' variable, so now it is just the same as 'bs'. Signed-off-by: Fabiano Rosas Reviewed-by: Philippe Mathieu-Daudé --- block/qapi.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index 79bf80c503..1cbb0935ff 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -48,7 +48,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, { ImageInfo **p_image_info; ImageInfo *backing_info; -BlockDriverState *bs0, *backing; +BlockDriverState *backing; BlockDeviceInfo *info; ERRP_GUARD(); @@ -145,7 +145,6 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, info->write_threshold = bdrv_write_threshold_get(bs); -bs0 = bs; p_image_info = >image; info->backing_file_depth = 0; @@ -153,7 +152,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, * Skip automatically inserted nodes that the user isn't aware of for * query-block (blk != NULL), but not for query-named-block-nodes */ -bdrv_query_image_info(bs0, p_image_info, flat, blk != NULL, errp); +bdrv_query_image_info(bs, p_image_info, flat, blk != NULL, errp); if (*errp) { qapi_free_BlockDeviceInfo(info); return NULL; -- 2.35.3
Re: [PULL v3 1/5] block-migration: Ensure we don't crash during migration cleanup
Michael Tokarev writes: > 30.08.2023 14:49, Stefan Hajnoczi wrote: >> From: Fabiano Rosas >> >> We can fail the blk_insert_bs() at init_blk_migration(), leaving the >> BlkMigDevState without a dirty_bitmap and BlockDriverState. Account >> for the possibly missing elements when doing cleanup. >> >> Fix the following crashes: >> >> Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. >> 0x55ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at >> ../block/dirty-bitmap.c:359 >> 359 BlockDriverState *bs = bitmap->bs; >> #0 0x55ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at >> ../block/dirty-bitmap.c:359 >> #1 0x55bba331 in unset_dirty_tracking () at >> ../migration/block.c:371 >> #2 0x55bbad98 in block_migration_cleanup_bmds () at >> ../migration/block.c:681 >> >> Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. >> 0x55e971ff in bdrv_op_unblock (bs=0x0, >> op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073 >> 7073QLIST_FOREACH_SAFE(blocker, >op_blockers[op], list, next) { >> #0 0x55e971ff in bdrv_op_unblock (bs=0x0, >> op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073 >> #1 0x55e9734a in bdrv_op_unblock_all (bs=0x0, reason=0x0) at >> ../block.c:7095 >> #2 0x55bbae13 in block_migration_cleanup_bmds () at >> ../migration/block.c:690 > > This smells like -stable material, is it not? > (applies to 7.2, 8.0 and 8.1). Yes, I agree.
Re: [PATCH 2/2] io: follow coroutine AioContext in qio_channel_yield()
Stefan Hajnoczi writes: > @@ -2089,10 +2088,6 @@ static void nbd_attach_aio_context(BlockDriverState > *bs, > * the reconnect_delay_timer cannot be active here. > */ > assert(!s->reconnect_delay_timer); > - > -if (s->ioc) { > -qio_channel_attach_aio_context(s->ioc, new_context); > -} > } > > static void nbd_detach_aio_context(BlockDriverState *bs) > @@ -2101,10 +2096,6 @@ static void nbd_detach_aio_context(BlockDriverState > *bs) > > assert(!s->open_timer); > assert(!s->reconnect_delay_timer); > - > -if (s->ioc) { > -qio_channel_detach_aio_context(s->ioc); > -} > } The whole attach/detach functions could go away. > > static BlockDriver bdrv_nbd = { > diff --git a/io/channel-command.c b/io/channel-command.c > index 7ed726c802..1f61026222 100644 > --- a/io/channel-command.c > +++ b/io/channel-command.c > @@ -331,14 +331,21 @@ static int qio_channel_command_close(QIOChannel *ioc, > > > static void qio_channel_command_set_aio_fd_handler(QIOChannel *ioc, > - AioContext *ctx, > + AioContext *read_ctx, > IOHandler *io_read, > + AioContext *write_ctx, > IOHandler *io_write, > void *opaque) > { > QIOChannelCommand *cioc = QIO_CHANNEL_COMMAND(ioc); > -aio_set_fd_handler(ctx, cioc->readfd, io_read, NULL, NULL, NULL, opaque); > -aio_set_fd_handler(ctx, cioc->writefd, NULL, io_write, NULL, NULL, > opaque); > +if (read_ctx) { > +aio_set_fd_handler(read_ctx, cioc->readfd, io_read, NULL, > +NULL, NULL, opaque); > +} > +if (write_ctx) { > +aio_set_fd_handler(write_ctx, cioc->writefd, NULL, io_write, > +NULL, NULL, opaque); > +} > } > > ... > diff --git a/nbd/client.c b/nbd/client.c > index 479208d5d9..81877d088d 100644 > --- a/nbd/client.c > +++ b/nbd/client.c > @@ -948,7 +948,7 @@ static int nbd_start_negotiate(AioContext *aio_context, > QIOChannel *ioc, > ioc = *outioc; > if (aio_context) { > qio_channel_set_blocking(ioc, false, NULL); > -qio_channel_attach_aio_context(ioc, aio_context); > +qio_channel_set_follow_coroutine_ctx(ioc, true); This is actually unreachable, aio_context is always NULL here. > } > } else { > error_setg(errp, "Server does not support STARTTLS");
[PATCH] block-migration: Ensure we don't crash during migration cleanup
We can fail the blk_insert_bs() at init_blk_migration(), leaving the BlkMigDevState without a dirty_bitmap and BlockDriverState. Account for the possibly missing elements when doing cleanup. Fix the following crashes: Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. 0x55ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359 359 BlockDriverState *bs = bitmap->bs; #0 0x55ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359 #1 0x55bba331 in unset_dirty_tracking () at ../migration/block.c:371 #2 0x55bbad98 in block_migration_cleanup_bmds () at ../migration/block.c:681 Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. 0x55e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073 7073QLIST_FOREACH_SAFE(blocker, >op_blockers[op], list, next) { #0 0x55e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073 #1 0x55e9734a in bdrv_op_unblock_all (bs=0x0, reason=0x0) at ../block.c:7095 #2 0x55bbae13 in block_migration_cleanup_bmds () at ../migration/block.c:690 Signed-off-by: Fabiano Rosas --- migration/block.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/migration/block.c b/migration/block.c index b9580a6c7e..86c2256a2b 100644 --- a/migration/block.c +++ b/migration/block.c @@ -368,7 +368,9 @@ static void unset_dirty_tracking(void) BlkMigDevState *bmds; QSIMPLEQ_FOREACH(bmds, _mig_state.bmds_list, entry) { -bdrv_release_dirty_bitmap(bmds->dirty_bitmap); +if (bmds->dirty_bitmap) { +bdrv_release_dirty_bitmap(bmds->dirty_bitmap); +} } } @@ -676,13 +678,18 @@ static int64_t get_remaining_dirty(void) static void block_migration_cleanup_bmds(void) { BlkMigDevState *bmds; +BlockDriverState *bs; AioContext *ctx; unset_dirty_tracking(); while ((bmds = QSIMPLEQ_FIRST(_mig_state.bmds_list)) != NULL) { QSIMPLEQ_REMOVE_HEAD(_mig_state.bmds_list, entry); -bdrv_op_unblock_all(blk_bs(bmds->blk), bmds->blocker); + +bs = blk_bs(bmds->blk); +if (bs) { +bdrv_op_unblock_all(bs, bmds->blocker); +} error_free(bmds->blocker); /* Save ctx, because bmds->blk can disappear during blk_unref. */ -- 2.35.3
Re: [PATCH v2 00/10] block: Make raw_co_get_allocated_file_size asynchronous
Fabiano Rosas writes: > Hi, > > The major change from the last version is that this time I'm moving > all of the callers of bdrv_get_allocated_file_size() into > coroutines. I had to make some temporary changes to avoid asserts > while not all the callers were converted. > > I tried my best to explain why I think the changes are safe. To avoid > changing too much of the code I added a change that removes the > dependency of qmp_query_block from hmp_nbd_server_start, that way I > don't need to move all of the nbd code into a coroutine as well. > > Based on: > [PATCH v2 00/11] block: Re-enable the graph lock > https://lore.kernel.org/r/20230605085711.21261-1-kw...@redhat.com > > changes: > > - fixed duplicated commit message [Lin] > - clarified why we need to convert info-block [Claudio] > - added my rationale of why the changes are safe [Eric] > - converted all callers to coroutines [Kevin] > - made hmp_nbd_server_start don't depend on qmp_query_block > > CI run: https://gitlab.com/farosas/qemu/-/pipelines/895525156 Ping, this seems to have fallen through the cracks.
[PATCH v2 09/10] block: Convert qmp_query_block() to coroutine_fn
This is another caller of bdrv_get_allocated_file_size() that needs to be converted to a coroutine because that function will be made asynchronous when called (indirectly) from the QMP dispatcher. This QMP command is a candidate because it calls bdrv_do_query_node_info(), which in turn calls bdrv_get_allocated_file_size(). We've determined bdrv_do_query_node_info() to be coroutine-safe (see previous commits), so we can just put this QMP command in a coroutine. Since qmp_query_block() now expects to run in a coroutine, its callers need to be converted as well. Convert hmp_info_block(), which calls only coroutine-safe code, including qmp_query_named_block_nodes() which has been converted to coroutine in the previous patches. Now that all callers of bdrv_[co_]block_device_info() are using the coroutine version, a few things happen: - we can return to using bdrv_block_device_info() without a wrapper; - bdrv_get_allocated_file_size() can stop being mixed; - bdrv_co_get_allocated_file_size() needs to be put under the graph lock because it is being called wthout the wrapper; - bdrv_do_query_node_info() doesn't need to acquire the AioContext because it doesn't call aio_poll anymore; Signed-off-by: Fabiano Rosas --- block.c| 2 +- block/monitor/block-hmp-cmds.c | 2 +- block/qapi.c | 18 +- hmp-commands-info.hx | 1 + include/block/block-hmp-cmds.h | 2 +- include/block/block-io.h | 2 +- include/block/qapi.h | 12 qapi/block-core.json | 2 +- 8 files changed, 19 insertions(+), 22 deletions(-) diff --git a/block.c b/block.c index abed744b60..f94cee8930 100644 --- a/block.c +++ b/block.c @@ -6148,7 +6148,7 @@ BlockDeviceInfoList *bdrv_named_nodes_list(bool flat, list = NULL; QTAILQ_FOREACH(bs, _bdrv_states, node_list) { -BlockDeviceInfo *info = bdrv_co_block_device_info(NULL, bs, flat, errp); +BlockDeviceInfo *info = bdrv_block_device_info(NULL, bs, flat, errp); if (!info) { qapi_free_BlockDeviceInfoList(list); return NULL; diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c index 26116fe831..1049f9b006 100644 --- a/block/monitor/block-hmp-cmds.c +++ b/block/monitor/block-hmp-cmds.c @@ -742,7 +742,7 @@ static void print_block_info(Monitor *mon, BlockInfo *info, } } -void hmp_info_block(Monitor *mon, const QDict *qdict) +void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict) { BlockInfoList *block_list, *info; BlockDeviceInfoList *blockdev_list, *blockdev; diff --git a/block/qapi.c b/block/qapi.c index 20660e15d6..3b4bc0b782 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -41,10 +41,10 @@ #include "qemu/qemu-print.h" #include "sysemu/block-backend.h" -BlockDeviceInfo *coroutine_fn bdrv_co_block_device_info(BlockBackend *blk, -BlockDriverState *bs, -bool flat, -Error **errp) +BlockDeviceInfo *coroutine_fn bdrv_block_device_info(BlockBackend *blk, + BlockDriverState *bs, + bool flat, + Error **errp) { ImageInfo **p_image_info; ImageInfo *backing_info; @@ -235,8 +235,6 @@ static void bdrv_do_query_node_info(BlockDriverState *bs, int ret; Error *err = NULL; -aio_context_acquire(bdrv_get_aio_context(bs)); - size = bdrv_getlength(bs); if (size < 0) { error_setg_errno(errp, -size, "Can't get image size '%s'", @@ -249,7 +247,9 @@ static void bdrv_do_query_node_info(BlockDriverState *bs, info->filename= g_strdup(bs->filename); info->format = g_strdup(bdrv_get_format_name(bs)); info->virtual_size= size; -info->actual_size = bdrv_get_allocated_file_size(bs); +bdrv_graph_co_rdlock(); +info->actual_size = bdrv_co_get_allocated_file_size(bs); +bdrv_graph_co_rdunlock(); info->has_actual_size = info->actual_size >= 0; if (bs->encrypted) { info->encrypted = true; @@ -305,7 +305,7 @@ static void bdrv_do_query_node_info(BlockDriverState *bs, } out: -aio_context_release(bdrv_get_aio_context(bs)); +return; } /** @@ -668,7 +668,7 @@ bdrv_query_bds_stats(BlockDriverState *bs, bool blk_level) return s; } -BlockInfoList *qmp_query_block(Error **errp) +BlockInfoList *coroutine_fn qmp_query_block(Error **errp) { BlockInfoList *head = NULL, **p_next = BlockBackend *blk; diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx index 47d63d26db..996895f417 100644 --- a/hmp-commands-info.hx +++ b/hmp-commands-info.hx @@ -65,6 +65,7 @@ ERST
[PATCH v2 05/10] block: Convert bdrv_query_block_graph_info to coroutine
We're converting callers of bdrv_get_allocated_file_size() to run in coroutines because that function will be made asynchronous when called (indirectly) from the QMP dispatcher. This function is a candidate because it calls bdrv_do_query_node_info(), which in turn calls bdrv_get_allocated_file_size(). All the functions called from bdrv_do_query_node_info() onwards are coroutine-safe, either have a coroutine version themselves[1] or are mostly simple code/string manipulation[2]. 1) bdrv_getlength(), bdrv_get_allocated_file_size(), bdrv_get_info(), bdrv_get_specific_info(); 2) bdrv_refresh_filename(), bdrv_get_format_name(), bdrv_get_full_backing_filename(), bdrv_query_snapshot_info_list(); Signed-off-by: Fabiano Rosas --- block/qapi.c | 12 +++- include/block/qapi.h | 6 +- qemu-img.c | 2 -- 3 files changed, 12 insertions(+), 8 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index 1cbb0935ff..a2e71edaff 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -375,7 +375,7 @@ fail: } /** - * bdrv_query_block_graph_info: + * bdrv_co_query_block_graph_info: * @bs: root node to start from * @p_info: location to store image information * @errp: location to store error information @@ -384,15 +384,17 @@ fail: * * @p_info will be set only on success. On error, store error in @errp. */ -void bdrv_query_block_graph_info(BlockDriverState *bs, - BlockGraphInfo **p_info, - Error **errp) +void coroutine_fn bdrv_co_query_block_graph_info(BlockDriverState *bs, + BlockGraphInfo **p_info, + Error **errp) { BlockGraphInfo *info; BlockChildInfoList **children_list_tail; BdrvChild *c; ERRP_GUARD(); +assert_bdrv_graph_readable(); + info = g_new0(BlockGraphInfo, 1); bdrv_do_query_node_info(bs, qapi_BlockGraphInfo_base(info), errp); if (*errp) { @@ -408,7 +410,7 @@ void bdrv_query_block_graph_info(BlockDriverState *bs, QAPI_LIST_APPEND(children_list_tail, c_info); c_info->name = g_strdup(c->name); -bdrv_query_block_graph_info(c->bs, _info->info, errp); +bdrv_co_query_block_graph_info(c->bs, _info->info, errp); if (*errp) { goto fail; } diff --git a/include/block/qapi.h b/include/block/qapi.h index 8663971c58..7035bcd1ae 100644 --- a/include/block/qapi.h +++ b/include/block/qapi.h @@ -25,6 +25,7 @@ #ifndef BLOCK_QAPI_H #define BLOCK_QAPI_H +#include "block/block-common.h" #include "block/graph-lock.h" #include "block/snapshot.h" #include "qapi/qapi-types-block-core.h" @@ -41,7 +42,10 @@ void bdrv_query_image_info(BlockDriverState *bs, bool flat, bool skip_implicit_filters, Error **errp); -void GRAPH_RDLOCK +void coroutine_fn GRAPH_RDLOCK +bdrv_co_query_block_graph_info(BlockDriverState *bs, BlockGraphInfo **p_info, + Error **errp); +void co_wrapper_bdrv_rdlock bdrv_query_block_graph_info(BlockDriverState *bs, BlockGraphInfo **p_info, Error **errp); diff --git a/qemu-img.c b/qemu-img.c index 27f48051b0..8066286f5e 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -2945,9 +2945,7 @@ static BlockGraphInfoList *collect_image_info_list(bool image_opts, * duplicate the backing chain information that we obtain by walking * the chain manually here. */ -bdrv_graph_rdlock_main_loop(); bdrv_query_block_graph_info(bs, , ); -bdrv_graph_rdunlock_main_loop(); if (err) { error_report_err(err); -- 2.35.3
[PATCH v2 01/10] block: Remove bdrv_query_block_node_info
The last call site of this function has been removed by commit c04d0ab026 ("qemu-img: Let info print block graph"). Reviewed-by: Claudio Fontana Signed-off-by: Fabiano Rosas --- block/qapi.c | 27 --- include/block/qapi.h | 3 --- 2 files changed, 30 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index f34f95e0ef..79bf80c503 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -309,33 +309,6 @@ out: aio_context_release(bdrv_get_aio_context(bs)); } -/** - * bdrv_query_block_node_info: - * @bs: block node to examine - * @p_info: location to store node information - * @errp: location to store error information - * - * Store image information about @bs in @p_info. - * - * @p_info will be set only on success. On error, store error in @errp. - */ -void bdrv_query_block_node_info(BlockDriverState *bs, -BlockNodeInfo **p_info, -Error **errp) -{ -BlockNodeInfo *info; -ERRP_GUARD(); - -info = g_new0(BlockNodeInfo, 1); -bdrv_do_query_node_info(bs, info, errp); -if (*errp) { -qapi_free_BlockNodeInfo(info); -return; -} - -*p_info = info; -} - /** * bdrv_query_image_info: * @bs: block node to examine diff --git a/include/block/qapi.h b/include/block/qapi.h index 18d48ddb70..8663971c58 100644 --- a/include/block/qapi.h +++ b/include/block/qapi.h @@ -36,9 +36,6 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, int bdrv_query_snapshot_info_list(BlockDriverState *bs, SnapshotInfoList **p_list, Error **errp); -void bdrv_query_block_node_info(BlockDriverState *bs, -BlockNodeInfo **p_info, -Error **errp); void bdrv_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat, -- 2.35.3
[PATCH v2 06/10] block: Convert bdrv_block_device_info into co_wrapper
We're converting callers of bdrv_get_allocated_file_size() to run in coroutines because that function will be made asynchronous when called (indirectly) from the QMP dispatcher. This function is a candidate because it calls bdrv_query_image_info() -> bdrv_do_query_node_info() -> bdrv_get_allocated_file_size(). It is safe to turn this is a coroutine because the code it calls is made up of either simple accessors and string manipulation functions [1] or it has already been determined to be safe [2]. 1) bdrv_refresh_filename(), bdrv_is_read_only(), blk_enable_write_cache(), bdrv_cow_bs(), blk_get_public(), throttle_group_get_name(), bdrv_write_threshold_get(), bdrv_query_dirty_bitmaps(), throttle_group_get_config(), bdrv_filter_or_cow_bs(), bdrv_skip_implicit_filters() 2) bdrv_do_query_node_info() (see previous commit); Signed-off-by: Fabiano Rosas --- block/qapi.c | 8 include/block/qapi.h | 12 2 files changed, 12 insertions(+), 8 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index a2e71edaff..20660e15d6 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -41,10 +41,10 @@ #include "qemu/qemu-print.h" #include "sysemu/block-backend.h" -BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, -BlockDriverState *bs, -bool flat, -Error **errp) +BlockDeviceInfo *coroutine_fn bdrv_co_block_device_info(BlockBackend *blk, +BlockDriverState *bs, +bool flat, +Error **errp) { ImageInfo **p_image_info; ImageInfo *backing_info; diff --git a/include/block/qapi.h b/include/block/qapi.h index 7035bcd1ae..5cb0202791 100644 --- a/include/block/qapi.h +++ b/include/block/qapi.h @@ -30,10 +30,14 @@ #include "block/snapshot.h" #include "qapi/qapi-types-block-core.h" -BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, -BlockDriverState *bs, -bool flat, -Error **errp); +BlockDeviceInfo *coroutine_fn bdrv_co_block_device_info(BlockBackend *blk, +BlockDriverState *bs, +bool flat, +Error **errp); +BlockDeviceInfo *co_wrapper bdrv_block_device_info(BlockBackend *blk, + BlockDriverState *bs, + bool flat, + Error **errp); int bdrv_query_snapshot_info_list(BlockDriverState *bs, SnapshotInfoList **p_list, Error **errp); -- 2.35.3
[PATCH v2 07/10] block: Convert qmp_query_named_block_nodes to coroutine
From: Lin Ma We're converting callers of bdrv_get_allocated_file_size() to run in coroutines because that function will be made asynchronous when called (indirectly) from the QMP dispatcher. This QMP command is a candidate because it indirectly calls bdrv_get_allocated_file_size() through bdrv_block_device_info() -> bdrv_query_image_info() -> bdrv_query_image_info(). The previous patches have determined that bdrv_query_image_info() and bdrv_do_query_node_info() are coroutine-safe so we can just make the QMP command run in a coroutine. Signed-off-by: Lin Ma Signed-off-by: Fabiano Rosas --- block.c | 2 +- blockdev.c | 6 +++--- qapi/block-core.json | 3 ++- 3 files changed, 6 insertions(+), 5 deletions(-) diff --git a/block.c b/block.c index f94cee8930..abed744b60 100644 --- a/block.c +++ b/block.c @@ -6148,7 +6148,7 @@ BlockDeviceInfoList *bdrv_named_nodes_list(bool flat, list = NULL; QTAILQ_FOREACH(bs, _bdrv_states, node_list) { -BlockDeviceInfo *info = bdrv_block_device_info(NULL, bs, flat, errp); +BlockDeviceInfo *info = bdrv_co_block_device_info(NULL, bs, flat, errp); if (!info) { qapi_free_BlockDeviceInfoList(list); return NULL; diff --git a/blockdev.c b/blockdev.c index e6eba61484..8b5f7d06c8 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2818,9 +2818,9 @@ void qmp_drive_backup(DriveBackup *backup, Error **errp) blockdev_do_action(, errp); } -BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat, - bool flat, - Error **errp) +BlockDeviceInfoList *coroutine_fn qmp_query_named_block_nodes(bool has_flat, + bool flat, + Error **errp) { bool return_flat = has_flat && flat; diff --git a/qapi/block-core.json b/qapi/block-core.json index 5dd5f7e4b0..9d4c92f2c9 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1972,7 +1972,8 @@ { 'command': 'query-named-block-nodes', 'returns': [ 'BlockDeviceInfo' ], 'data': { '*flat': 'bool' }, - 'allow-preconfig': true } + 'allow-preconfig': true, + 'coroutine': true} ## # @XDbgBlockGraphNodeType: -- 2.35.3
[PATCH v2 00/10] block: Make raw_co_get_allocated_file_size asynchronous
Hi, The major change from the last version is that this time I'm moving all of the callers of bdrv_get_allocated_file_size() into coroutines. I had to make some temporary changes to avoid asserts while not all the callers were converted. I tried my best to explain why I think the changes are safe. To avoid changing too much of the code I added a change that removes the dependency of qmp_query_block from hmp_nbd_server_start, that way I don't need to move all of the nbd code into a coroutine as well. Based on: [PATCH v2 00/11] block: Re-enable the graph lock https://lore.kernel.org/r/20230605085711.21261-1-kw...@redhat.com changes: - fixed duplicated commit message [Lin] - clarified why we need to convert info-block [Claudio] - added my rationale of why the changes are safe [Eric] - converted all callers to coroutines [Kevin] - made hmp_nbd_server_start don't depend on qmp_query_block CI run: https://gitlab.com/farosas/qemu/-/pipelines/895525156 === v1: https://lore.kernel.org/r/20230523213903.18418-1-faro...@suse.de As discussed in another thread [1], here's an RFC addressing a VCPU softlockup encountered when issuing QMP commands that target a disk placed on NFS. Since QMP commands happen with the qemu_global_mutex locked, any command that takes too long to finish will block other threads waiting to take the global mutex. One such thread could be a VCPU thread going out of the guest to handle IO. This is the case when issuing the QMP command query-block, which eventually calls raw_co_get_allocated_file_size(). This function makes an 'fstat' call that has been observed to take a long time (seconds) over NFS. NFS latency issues aside, we can improve the situation by not blocking VCPU threads while the command is running. Move the 'fstat' call into the thread-pool and make the necessary adaptations to ensure raw_co_get_allocated_file_size runs in a coroutine in the block driver aio_context. 1- Question about QMP and BQL https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03141.html Fabiano Rosas (8): block: Remove bdrv_query_block_node_info block: Remove unnecessary variable in bdrv_block_device_info block: Allow the wrapper script to see functions declared in qapi.h block: Temporarily mark bdrv_co_get_allocated_file_size as mixed block: Convert bdrv_query_block_graph_info to coroutine block: Convert bdrv_block_device_info into co_wrapper block: Don't query all block devices at hmp_nbd_server_start block: Convert qmp_query_block() to coroutine_fn João Silva (1): block: Add a thread-pool version of fstat Lin Ma (1): block: Convert qmp_query_named_block_nodes to coroutine block/file-posix.c | 40 +-- block/meson.build | 1 + block/monitor/block-hmp-cmds.c | 22 ++- block/qapi.c | 62 +- blockdev.c | 6 +-- hmp-commands-info.hx | 1 + include/block/block-hmp-cmds.h | 2 +- include/block/qapi.h | 17 include/block/raw-aio.h| 4 +- qapi/block-core.json | 5 ++- qemu-img.c | 2 - scripts/block-coroutine-wrapper.py | 1 + 12 files changed, 90 insertions(+), 73 deletions(-) -- 2.35.3
[PATCH v2 04/10] block: Temporarily mark bdrv_co_get_allocated_file_size as mixed
Some callers of this function are about to be converted to run in coroutines, so allow it to be executed both inside and outside a coroutine while we convert all the callers. This will be reverted once all callers of bdrv_do_query_node_info run in a coroutine. Signed-off-by: Fabiano Rosas Reviewed-by: Eric Blake --- include/block/block-io.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/block/block-io.h b/include/block/block-io.h index 43af816d75..f31e25cf56 100644 --- a/include/block/block-io.h +++ b/include/block/block-io.h @@ -87,7 +87,7 @@ int64_t co_wrapper_mixed_bdrv_rdlock bdrv_getlength(BlockDriverState *bs); int64_t coroutine_fn GRAPH_RDLOCK bdrv_co_get_allocated_file_size(BlockDriverState *bs); -int64_t co_wrapper_bdrv_rdlock +int64_t co_wrapper_mixed_bdrv_rdlock bdrv_get_allocated_file_size(BlockDriverState *bs); BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts, -- 2.35.3
[PATCH v2 03/10] block: Allow the wrapper script to see functions declared in qapi.h
The following patches will add co_wrapper annotations to functions declared in qapi.h. Add that header to the set of files used by block-coroutine-wrapper.py. Signed-off-by: Fabiano Rosas --- block/meson.build | 1 + scripts/block-coroutine-wrapper.py | 1 + 2 files changed, 2 insertions(+) diff --git a/block/meson.build b/block/meson.build index fb4332bd66..7ad6a396a4 100644 --- a/block/meson.build +++ b/block/meson.build @@ -150,6 +150,7 @@ block_gen_c = custom_target('block-gen.c', '../include/block/dirty-bitmap.h', '../include/block/block_int-io.h', '../include/block/block-global-state.h', + '../include/block/qapi.h', '../include/sysemu/block-backend-global-state.h', '../include/sysemu/block-backend-io.h', 'coroutines.h' diff --git a/scripts/block-coroutine-wrapper.py b/scripts/block-coroutine-wrapper.py index d4a183db61..814b62df26 100644 --- a/scripts/block-coroutine-wrapper.py +++ b/scripts/block-coroutine-wrapper.py @@ -44,6 +44,7 @@ def gen_header(): #include "block/block-gen.h" #include "block/block_int.h" #include "block/dirty-bitmap.h" +#include "block/qapi.h" """ -- 2.35.3
[PATCH v2 08/10] block: Don't query all block devices at hmp_nbd_server_start
We're currently doing a full query-block just to enumerate the devices for qmp_nbd_server_add and then discarding the BlockInfoList afterwards. Alter hmp_nbd_server_start to instead iterate explicitly over the block_backends list. This allows the removal of the dependency on qmp_query_block from hmp_nbd_server_start. This is desirable because we're about to move qmp_query_block into a coroutine and don't need to change the NBD code at the same time. Signed-off-by: Fabiano Rosas --- block/monitor/block-hmp-cmds.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c index ca2599de44..26116fe831 100644 --- a/block/monitor/block-hmp-cmds.c +++ b/block/monitor/block-hmp-cmds.c @@ -394,7 +394,7 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict) bool writable = qdict_get_try_bool(qdict, "writable", false); bool all = qdict_get_try_bool(qdict, "all", false); Error *local_err = NULL; -BlockInfoList *block_list, *info; +BlockBackend *blk; SocketAddress *addr; NbdServerAddOptions export; @@ -419,18 +419,24 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict) return; } -/* Then try adding all block devices. If one fails, close all and +/* + * Then try adding all block devices. If one fails, close all and * exit. */ -block_list = qmp_query_block(NULL); +for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) { +BlockDriverState *bs = blk_bs(blk); -for (info = block_list; info; info = info->next) { -if (!info->value->inserted) { +if (!*blk_name(blk) && !blk_get_attached_dev(blk)) { +continue; +} + +bs = bdrv_skip_implicit_filters(bs); +if (!bs || !bs->drv) { continue; } export = (NbdServerAddOptions) { -.device = info->value->device, +.device = g_strdup(blk_name(blk)), .has_writable = true, .writable = writable, }; @@ -443,8 +449,6 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict) } } -qapi_free_BlockInfoList(block_list); - exit: hmp_handle_error(mon, local_err); } -- 2.35.3
[PATCH v2 02/10] block: Remove unnecessary variable in bdrv_block_device_info
The commit 5d8813593f ("block/qapi: Let bdrv_query_image_info() recurse") removed the loop where we set the 'bs0' variable, so now it is just the same as 'bs'. Signed-off-by: Fabiano Rosas --- block/qapi.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index 79bf80c503..1cbb0935ff 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -48,7 +48,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, { ImageInfo **p_image_info; ImageInfo *backing_info; -BlockDriverState *bs0, *backing; +BlockDriverState *backing; BlockDeviceInfo *info; ERRP_GUARD(); @@ -145,7 +145,6 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, info->write_threshold = bdrv_write_threshold_get(bs); -bs0 = bs; p_image_info = >image; info->backing_file_depth = 0; @@ -153,7 +152,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, * Skip automatically inserted nodes that the user isn't aware of for * query-block (blk != NULL), but not for query-named-block-nodes */ -bdrv_query_image_info(bs0, p_image_info, flat, blk != NULL, errp); +bdrv_query_image_info(bs, p_image_info, flat, blk != NULL, errp); if (*errp) { qapi_free_BlockDeviceInfo(info); return NULL; -- 2.35.3
[PATCH v2 10/10] block: Add a thread-pool version of fstat
From: João Silva The fstat call can take a long time to finish when running over NFS. Add a version of it that runs in the thread pool. Adapt one of its users, raw_co_get_allocated_file size to use the new version. That function is called via QMP under the qemu_global_mutex so it has a large chance of blocking VCPU threads in case it takes too long to finish. Reviewed-by: Claudio Fontana Signed-off-by: João Silva Signed-off-by: Fabiano Rosas --- block/file-posix.c | 40 +--- include/block/raw-aio.h | 4 +++- 2 files changed, 40 insertions(+), 4 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index ac1ed54811..45232dc0f9 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -227,6 +227,9 @@ typedef struct RawPosixAIOData { struct { unsigned long op; } zone_mgmt; +struct { +struct stat *st; +} fstat; }; } RawPosixAIOData; @@ -2614,6 +2617,34 @@ static void raw_close(BlockDriverState *bs) } } +static int handle_aiocb_fstat(void *opaque) +{ +RawPosixAIOData *aiocb = opaque; + +if (fstat(aiocb->aio_fildes, aiocb->fstat.st) < 0) { +return -errno; +} + +return 0; +} + +static int coroutine_fn raw_co_fstat(BlockDriverState *bs, struct stat *st) +{ +BDRVRawState *s = bs->opaque; +RawPosixAIOData acb; + +acb = (RawPosixAIOData) { +.bs = bs, +.aio_fildes = s->fd, +.aio_type = QEMU_AIO_FSTAT, +.fstat = { +.st = st, +}, +}; + +return raw_thread_pool_submit(handle_aiocb_fstat, ); +} + /** * Truncates the given regular file @fd to @offset and, when growing, fills the * new space according to @prealloc. @@ -2853,11 +2884,14 @@ static int64_t coroutine_fn raw_co_getlength(BlockDriverState *bs) static int64_t coroutine_fn raw_co_get_allocated_file_size(BlockDriverState *bs) { struct stat st; -BDRVRawState *s = bs->opaque; +int ret; -if (fstat(s->fd, ) < 0) { -return -errno; +ret = raw_co_fstat(bs, ); + +if (ret) { +return ret; } + return (int64_t)st.st_blocks * 512; } diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h index 0f63c2800c..1f2c678461 100644 --- a/include/block/raw-aio.h +++ b/include/block/raw-aio.h @@ -31,6 +31,7 @@ #define QEMU_AIO_ZONE_REPORT 0x0100 #define QEMU_AIO_ZONE_MGMT0x0200 #define QEMU_AIO_ZONE_APPEND 0x0400 +#define QEMU_AIO_FSTAT0x0800 #define QEMU_AIO_TYPE_MASK \ (QEMU_AIO_READ | \ QEMU_AIO_WRITE | \ @@ -42,7 +43,8 @@ QEMU_AIO_TRUNCATE | \ QEMU_AIO_ZONE_REPORT | \ QEMU_AIO_ZONE_MGMT | \ - QEMU_AIO_ZONE_APPEND) + QEMU_AIO_ZONE_APPEND | \ + QEMU_AIO_FSTAT) /* AIO flags */ #define QEMU_AIO_MISALIGNED 0x1000 -- 2.35.3
Re: [PATCH 04/16] qemu-file: Don't call qemu_fflush() for read only files
Juan Quintela writes: > This was the only caller for read only files. So change the test for > an assert in qemu_fflush(). > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH 03/16] migration: Use qemu_file_transferred_noflush() for block migration.
Juan Quintela writes: > We only care about the amount of bytes transferred. Flushing is done > by the system somewhere else. > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH 13/16] qemu-file: Simplify qemu_file_get_error()
Juan Quintela writes: > If we pass a NULL error is the same that returning dirrectly the value. > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH 10/16] qemu-file: Remove _noflush from qemu_file_transferred_noflush()
Juan Quintela writes: > qemu_file_transferred() don't exist anymore, so we can reuse the name. > > Signed-off-by: Juan Quintela > --- > migration/qemu-file.h | 4 ++-- > migration/block.c | 4 ++-- > migration/qemu-file.c | 2 +- > migration/savevm.c| 6 +++--- > migration/vmstate.c | 4 ++-- > 5 files changed, 10 insertions(+), 10 deletions(-) > > diff --git a/migration/qemu-file.h b/migration/qemu-file.h > index b4fb872018..3575dfa5ff 100644 > --- a/migration/qemu-file.h > +++ b/migration/qemu-file.h > @@ -34,7 +34,7 @@ QEMUFile *qemu_file_new_output(QIOChannel *ioc); > int qemu_fclose(QEMUFile *f); > > /* > - * qemu_file_transferred_noflush: > + * qemu_file_transferred: > * > * As qemu_file_transferred except for writable files, where no flush Docs need updating^ > * is performed and the reported amount will include the size of any > @@ -42,7 +42,7 @@ int qemu_fclose(QEMUFile *f); > * > * Returns: the total bytes transferred and queued > */ > -uint64_t qemu_file_transferred_noflush(QEMUFile *f); > +uint64_t qemu_file_transferred(QEMUFile *f);
Re: [RFC PATCH 5/6] block: Allow bdrv_get_allocated_file_size to run in bdrv context
Kevin Wolf writes: > Am 23.05.2023 um 23:39 hat Fabiano Rosas geschrieben: >> We're about to move calls to 'fstat' into the thread-pool to avoid >> blocking VCPU threads should the system call take too long. >> >> To achieve that we first need to make sure none of its callers is >> holding the aio_context lock, otherwise yielding before scheduling the >> aiocb handler would result in a deadlock when the qemu_global_mutex is >> released and another thread tries to acquire the aio_context. >> >> Signed-off-by: Fabiano Rosas >> --- >> block/qapi.c | 22 +- >> 1 file changed, 21 insertions(+), 1 deletion(-) >> >> diff --git a/block/qapi.c b/block/qapi.c >> index ae6cd1c2ff..cd197abf1f 100644 >> --- a/block/qapi.c >> +++ b/block/qapi.c >> @@ -222,6 +222,26 @@ int bdrv_query_snapshot_info_list(BlockDriverState *bs, >> return 0; >> } >> >> +static int64_t bdrv_get_actual_size(BlockDriverState *bs) >> +{ >> +int64_t size; >> +AioContext *old_ctx = NULL; >> + >> +if (qemu_in_coroutine()) { > > Hm. Why can't we make sure that it always runs in a coroutine? > > Callers: > > * bdrv_query_block_node_info(). This functions seems to be completely > unused, we can remove it. > Ok, I'm already removing it in patch 1. > * bdrv_query_image_info(). Called by bdrv_block_device_info() and itself. > bdrv_block_device_info() could become a co_wrapper after swapping the > first two parameters so that it runs in the AioContext of @bs. > We cannot have bdrv_block_device_info() as co_wrapper because that would create its own coroutine and yielding from that would merely have us waiting at bdrv_poll_co. So it doesn't solve the blocking issue. What would work is to make bdrv_block_device_info() a coroutine_fn (without a wrapper). Its two callers, qmp_query_block() and qmp_query_named_block_nodes() are already being moved into the handler coroutine in this series, so it would be mostly a matter of adding the type annotation. > * bdrv_query_block_graph_info(). Only called by qemu-img. Could become a > co_wrapper_bdrv_rdlock. > This one works ok. >> +aio_context_release(bdrv_get_aio_context(bs)); >> +old_ctx = bdrv_co_enter(bs); > > I think this is the wrong function to do this. The caller should already > make sure that it's in the right AioContext. > The issue here is that bdrv_do_query_node_info() calls bdrv_co_get_allocated_file_size(), which is the function we want to make async and therefore needs to run outside of aio_context_acquire/release. However, bdrv_do_query_node_info() also calls bdrv_refresh_filename(), which is GLOBAL_STATE_CODE and therefore wants to be in the main thread context. So entering the bs AioContext at the caller doesn't work when giving the device its own iothread. >> +} >> + >> +size = bdrv_get_allocated_file_size(bs); >> + >> +if (qemu_in_coroutine() && old_ctx) { >> +bdrv_co_leave(bs, old_ctx); >> +aio_context_acquire(bdrv_get_aio_context(bs)); >> +} >> + >> +return size; >> +} > > Kevin
Re: [RFC PATCH 6/6] block: Add a thread-pool version of fstat
Eric Blake writes: > On Tue, May 23, 2023 at 06:39:03PM -0300, Fabiano Rosas wrote: >> From: João Silva >> >> The fstat call can take a long time to finish when running over >> NFS. Add a version of it that runs in the thread pool. >> >> Adapt one of its users, raw_co_get_allocated_file size to use the new >> version. That function is called via QMP under the qemu_global_mutex >> so it has a large chance of blocking VCPU threads in case it takes too >> long to finish. >> >> Signed-off-by: João Silva >> Signed-off-by: Fabiano Rosas >> --- >> block/file-posix.c | 40 +--- >> include/block/raw-aio.h | 4 +++- >> 2 files changed, 40 insertions(+), 4 deletions(-) > > Should this change occur earlier in the series, before calling > commands are marked with QAPI coroutine flags? Otherwise, you have a > bisection bug, where something marked coroutine can end up hanging > when it calls a blocking syscall in the wrong context without the help > of this patch offloading the syscall into a helper thread. Hmm, I'm not sure. To submit the work to the thread pool we need to be in a coroutine already. If the syscall blocks for too long we'd be trading blocking the coroutine vs. blocking a vcpu thread anyway. I have tested each patch to avoid bisection issues, but maybe it would be warranted to merge both parts into a single patch. Or arrange them in some other way... I'll experiment with it.
Re: [RFC PATCH 3/6] Convert query-block/info_block to coroutine
Eric Blake writes: > On Tue, May 23, 2023 at 06:39:00PM -0300, Fabiano Rosas wrote: >> From: Lin Ma >> >> Sometimes the query-block performs time-consuming I/O(say waiting for >> the fstat of NFS complete), So let's make this QMP handler runs in a >> coroutine. > > Grammar suggestions: > > Sometimes the query-block command performs time-consuming I/O (say > waiting for the fstat of NFS to complete), so let's make this QMP > handler run in a coroutine. > Thanks! >> >> The following patch moves the fstat() into a thread pool. >> >> Signed-off-by: Lin Ma >> Signed-off-by: Fabiano Rosas > >> --- >> +++ b/qapi/block-core.json >> @@ -838,7 +838,7 @@ >> #} >> ## >> { 'command': 'query-block', 'returns': ['BlockInfo'], >> - 'allow-preconfig': true } >> + 'allow-preconfig': true, 'coroutine': true } > > Rereading docs/devel/qapi-code-gen.rst: > > | Coroutine safety can be hard to prove, similar to thread safety. Common > | pitfalls are: > | > | - The global mutex isn't held across ``qemu_coroutine_yield()``, so > | operations that used to assume that they execute atomically may have > | to be more careful to protect against changes in the global state. > | > | - Nested event loops (``AIO_WAIT_WHILE()`` etc.) are problematic in > | coroutine context and can easily lead to deadlocks. They should be > | replaced by yielding and reentering the coroutine when the condition > | becomes false. > | > | Since the command handler may assume coroutine context, any callers > | other than the QMP dispatcher must also call it in coroutine context. > | In particular, HMP commands calling such a QMP command handler must be > | marked ``.coroutine = true`` in hmp-commands.hx. > > I agree that you also need to change the HMP handler, but the commit > message didn't mention that other than in the subject line. The Ok, I'll update the message for v2. > commit message could also do a better job at explaining how you have > audited that merely adding the coroutine marker is safe (ie. that all > of the calls made by query_block are already coroutine safe). While > the intent behind this patch is on the right track, I'm not sure if I > have enough information to say whether it is safe, or if there are > other lurking problems we will need to fix first. Fair point, I've been trusting the tests too much. A closer code audit is necessary indeed.
Re: [RFC PATCH 4/6] Convert query-block/info_block to coroutine
Lin Ma writes: > The commit title/message are duplicated to previous one, Here should > use "query-named-block-nodes" instead. > > Lin > Ugh, what a blunder, they're even nicely aligned in the git log. I'll fix it in the next version. Thanks!
[RFC PATCH 1/6] block: Remove bdrv_query_block_node_info
The last call site of this function has been removed by commit c04d0ab026 ("qemu-img: Let info print block graph"). Signed-off-by: Fabiano Rosas --- block/qapi.c | 27 --- include/block/qapi.h | 3 --- 2 files changed, 30 deletions(-) diff --git a/block/qapi.c b/block/qapi.c index f34f95e0ef..79bf80c503 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -309,33 +309,6 @@ out: aio_context_release(bdrv_get_aio_context(bs)); } -/** - * bdrv_query_block_node_info: - * @bs: block node to examine - * @p_info: location to store node information - * @errp: location to store error information - * - * Store image information about @bs in @p_info. - * - * @p_info will be set only on success. On error, store error in @errp. - */ -void bdrv_query_block_node_info(BlockDriverState *bs, -BlockNodeInfo **p_info, -Error **errp) -{ -BlockNodeInfo *info; -ERRP_GUARD(); - -info = g_new0(BlockNodeInfo, 1); -bdrv_do_query_node_info(bs, info, errp); -if (*errp) { -qapi_free_BlockNodeInfo(info); -return; -} - -*p_info = info; -} - /** * bdrv_query_image_info: * @bs: block node to examine diff --git a/include/block/qapi.h b/include/block/qapi.h index 18d48ddb70..8663971c58 100644 --- a/include/block/qapi.h +++ b/include/block/qapi.h @@ -36,9 +36,6 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk, int bdrv_query_snapshot_info_list(BlockDriverState *bs, SnapshotInfoList **p_list, Error **errp); -void bdrv_query_block_node_info(BlockDriverState *bs, -BlockNodeInfo **p_info, -Error **errp); void bdrv_query_image_info(BlockDriverState *bs, ImageInfo **p_info, bool flat, -- 2.35.3
[RFC PATCH 2/6] block: Mark bdrv_co_get_allocated_file_size() as mixed
Some callers of this function are about to be converted to use coroutines, so allow it to be executed both inside and outside a coroutine. Signed-off-by: Fabiano Rosas --- include/block/block-io.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/block/block-io.h b/include/block/block-io.h index a27e471a87..c1f96faca5 100644 --- a/include/block/block-io.h +++ b/include/block/block-io.h @@ -87,7 +87,7 @@ int64_t co_wrapper_mixed_bdrv_rdlock bdrv_getlength(BlockDriverState *bs); int64_t coroutine_fn GRAPH_RDLOCK bdrv_co_get_allocated_file_size(BlockDriverState *bs); -int64_t co_wrapper_bdrv_rdlock +int64_t co_wrapper_mixed_bdrv_rdlock bdrv_get_allocated_file_size(BlockDriverState *bs); BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts, -- 2.35.3
[RFC PATCH 3/6] Convert query-block/info_block to coroutine
From: Lin Ma Sometimes the query-block performs time-consuming I/O(say waiting for the fstat of NFS complete), So let's make this QMP handler runs in a coroutine. The following patch moves the fstat() into a thread pool. Signed-off-by: Lin Ma Signed-off-by: Fabiano Rosas --- block/monitor/block-hmp-cmds.c | 2 +- block/qapi.c | 2 +- hmp-commands-info.hx | 1 + include/block/block-hmp-cmds.h | 2 +- qapi/block-core.json | 2 +- 5 files changed, 5 insertions(+), 4 deletions(-) diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c index ca2599de44..4b704010bc 100644 --- a/block/monitor/block-hmp-cmds.c +++ b/block/monitor/block-hmp-cmds.c @@ -738,7 +738,7 @@ static void print_block_info(Monitor *mon, BlockInfo *info, } } -void hmp_info_block(Monitor *mon, const QDict *qdict) +void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict) { BlockInfoList *block_list, *info; BlockDeviceInfoList *blockdev_list, *blockdev; diff --git a/block/qapi.c b/block/qapi.c index 79bf80c503..ae6cd1c2ff 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -667,7 +667,7 @@ bdrv_query_bds_stats(BlockDriverState *bs, bool blk_level) return s; } -BlockInfoList *qmp_query_block(Error **errp) +BlockInfoList *coroutine_fn qmp_query_block(Error **errp) { BlockInfoList *head = NULL, **p_next = BlockBackend *blk; diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx index 47d63d26db..996895f417 100644 --- a/hmp-commands-info.hx +++ b/hmp-commands-info.hx @@ -65,6 +65,7 @@ ERST .help = "show info of one block device or all block devices " "(-n: show named nodes; -v: show details)", .cmd= hmp_info_block, +.coroutine = true, }, SRST diff --git a/include/block/block-hmp-cmds.h b/include/block/block-hmp-cmds.h index 71113cd7ef..6d9152318f 100644 --- a/include/block/block-hmp-cmds.h +++ b/include/block/block-hmp-cmds.h @@ -48,7 +48,7 @@ void hmp_eject(Monitor *mon, const QDict *qdict); void hmp_qemu_io(Monitor *mon, const QDict *qdict); -void hmp_info_block(Monitor *mon, const QDict *qdict); +void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict); void hmp_info_blockstats(Monitor *mon, const QDict *qdict); void hmp_info_block_jobs(Monitor *mon, const QDict *qdict); void hmp_info_snapshots(Monitor *mon, const QDict *qdict); diff --git a/qapi/block-core.json b/qapi/block-core.json index 98d9116dae..da95fe456c 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -838,7 +838,7 @@ #} ## { 'command': 'query-block', 'returns': ['BlockInfo'], - 'allow-preconfig': true } + 'allow-preconfig': true, 'coroutine': true } ## # @BlockDeviceTimedStats: -- 2.35.3
[RFC PATCH 0/6] block: Make raw_co_get_allocated_file_size asynchronous
As discussed in another thread [1], here's an RFC addressing a VCPU softlockup encountered when issuing QMP commands that target a disk placed on NFS. Since QMP commands happen with the qemu_global_mutex locked, any command that takes too long to finish will block other threads waiting to take the global mutex. One such thread could be a VCPU thread going out of the guest to handle IO. This is the case when issuing the QMP command query-block, which eventually calls raw_co_get_allocated_file_size(). This function makes an 'fstat' call that has been observed to take a long time (seconds) over NFS. NFS latency issues aside, we can improve the situation by not blocking VCPU threads while the command is running. Move the 'fstat' call into the thread-pool and make the necessary adaptations to ensure raw_co_get_allocated_file_size runs in a coroutine in the block driver aio_context. 1- Question about QMP and BQL https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03141.html CI run: https://gitlab.com/farosas/qemu/-/pipelines/876583685 Fabiano Rosas (3): block: Remove bdrv_query_block_node_info block: Mark bdrv_co_get_allocated_file_size() as mixed block: Allow bdrv_get_allocated_file_size to run in bdrv context João Silva (1): block: Add a thread-pool version of fstat Lin Ma (2): Convert query-block/info_block to coroutine Convert query-block/info_block to coroutine block/file-posix.c | 40 -- block/monitor/block-hmp-cmds.c | 2 +- block/qapi.c | 51 +++--- blockdev.c | 6 ++-- hmp-commands-info.hx | 1 + include/block/block-hmp-cmds.h | 2 +- include/block/block-io.h | 2 +- include/block/qapi.h | 3 -- include/block/raw-aio.h| 4 ++- qapi/block-core.json | 5 ++-- 10 files changed, 72 insertions(+), 44 deletions(-) -- 2.35.3
[RFC PATCH 6/6] block: Add a thread-pool version of fstat
From: João Silva The fstat call can take a long time to finish when running over NFS. Add a version of it that runs in the thread pool. Adapt one of its users, raw_co_get_allocated_file size to use the new version. That function is called via QMP under the qemu_global_mutex so it has a large chance of blocking VCPU threads in case it takes too long to finish. Signed-off-by: João Silva Signed-off-by: Fabiano Rosas --- block/file-posix.c | 40 +--- include/block/raw-aio.h | 4 +++- 2 files changed, 40 insertions(+), 4 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 0ab158efba..0a29a6cc43 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -227,6 +227,9 @@ typedef struct RawPosixAIOData { struct { unsigned long op; } zone_mgmt; +struct { +struct stat *st; +} fstat; }; } RawPosixAIOData; @@ -2644,6 +2647,34 @@ static void raw_close(BlockDriverState *bs) } } +static int handle_aiocb_fstat(void *opaque) +{ +RawPosixAIOData *aiocb = opaque; + +if (fstat(aiocb->aio_fildes, aiocb->fstat.st) < 0) { +return -errno; +} + +return 0; +} + +static int coroutine_fn raw_co_fstat(BlockDriverState *bs, struct stat *st) +{ +BDRVRawState *s = bs->opaque; +RawPosixAIOData acb; + +acb = (RawPosixAIOData) { +.bs = bs, +.aio_fildes = s->fd, +.aio_type = QEMU_AIO_FSTAT, +.fstat = { +.st = st, +}, +}; + +return raw_thread_pool_submit(handle_aiocb_fstat, ); +} + /** * Truncates the given regular file @fd to @offset and, when growing, fills the * new space according to @prealloc. @@ -2883,11 +2914,14 @@ static int64_t coroutine_fn raw_co_getlength(BlockDriverState *bs) static int64_t coroutine_fn raw_co_get_allocated_file_size(BlockDriverState *bs) { struct stat st; -BDRVRawState *s = bs->opaque; +int ret; -if (fstat(s->fd, ) < 0) { -return -errno; +ret = raw_co_fstat(bs, ); + +if (ret) { +return ret; } + return (int64_t)st.st_blocks * 512; } diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h index 0fe85ade77..7dc6d24e21 100644 --- a/include/block/raw-aio.h +++ b/include/block/raw-aio.h @@ -31,6 +31,7 @@ #define QEMU_AIO_ZONE_REPORT 0x0100 #define QEMU_AIO_ZONE_MGMT0x0200 #define QEMU_AIO_ZONE_APPEND 0x0400 +#define QEMU_AIO_FSTAT0x0800 #define QEMU_AIO_TYPE_MASK \ (QEMU_AIO_READ | \ QEMU_AIO_WRITE | \ @@ -42,7 +43,8 @@ QEMU_AIO_TRUNCATE | \ QEMU_AIO_ZONE_REPORT | \ QEMU_AIO_ZONE_MGMT | \ - QEMU_AIO_ZONE_APPEND) + QEMU_AIO_ZONE_APPEND | \ + QEMU_AIO_FSTAT) /* AIO flags */ #define QEMU_AIO_MISALIGNED 0x1000 -- 2.35.3
[RFC PATCH 5/6] block: Allow bdrv_get_allocated_file_size to run in bdrv context
We're about to move calls to 'fstat' into the thread-pool to avoid blocking VCPU threads should the system call take too long. To achieve that we first need to make sure none of its callers is holding the aio_context lock, otherwise yielding before scheduling the aiocb handler would result in a deadlock when the qemu_global_mutex is released and another thread tries to acquire the aio_context. Signed-off-by: Fabiano Rosas --- block/qapi.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/block/qapi.c b/block/qapi.c index ae6cd1c2ff..cd197abf1f 100644 --- a/block/qapi.c +++ b/block/qapi.c @@ -222,6 +222,26 @@ int bdrv_query_snapshot_info_list(BlockDriverState *bs, return 0; } +static int64_t bdrv_get_actual_size(BlockDriverState *bs) +{ +int64_t size; +AioContext *old_ctx = NULL; + +if (qemu_in_coroutine()) { +aio_context_release(bdrv_get_aio_context(bs)); +old_ctx = bdrv_co_enter(bs); +} + +size = bdrv_get_allocated_file_size(bs); + +if (qemu_in_coroutine() && old_ctx) { +bdrv_co_leave(bs, old_ctx); +aio_context_acquire(bdrv_get_aio_context(bs)); +} + +return size; +} + /** * Helper function for other query info functions. Store information about @bs * in @info, setting @errp on error. @@ -250,7 +270,7 @@ static void bdrv_do_query_node_info(BlockDriverState *bs, info->filename= g_strdup(bs->filename); info->format = g_strdup(bdrv_get_format_name(bs)); info->virtual_size= size; -info->actual_size = bdrv_get_allocated_file_size(bs); +info->actual_size = bdrv_get_actual_size(bs); info->has_actual_size = info->actual_size >= 0; if (bs->encrypted) { info->encrypted = true; -- 2.35.3
[RFC PATCH 4/6] Convert query-block/info_block to coroutine
From: Lin Ma Sometimes the query-block performs time-consuming I/O(say waiting for the fstat of NFS complete), So let's make this QMP handler runs in a coroutine. The following patch moves the fstat() into a thread pool. Signed-off-by: Lin Ma Signed-off-by: Fabiano Rosas --- blockdev.c | 6 +++--- qapi/block-core.json | 3 ++- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/blockdev.c b/blockdev.c index 5d56b79df4..6412548662 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2804,9 +2804,9 @@ void qmp_drive_backup(DriveBackup *backup, Error **errp) blockdev_do_action(, errp); } -BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat, - bool flat, - Error **errp) +BlockDeviceInfoList *coroutine_fn qmp_query_named_block_nodes(bool has_flat, + bool flat, + Error **errp) { bool return_flat = has_flat && flat; diff --git a/qapi/block-core.json b/qapi/block-core.json index da95fe456c..0559c38412 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1972,7 +1972,8 @@ { 'command': 'query-named-block-nodes', 'returns': [ 'BlockDeviceInfo' ], 'data': { '*flat': 'bool' }, - 'allow-preconfig': true } + 'allow-preconfig': true, + 'coroutine': true} ## # @XDbgBlockGraphNodeType: -- 2.35.3
Re: [PATCH v2 4/6] tests/qtest: make more migration pre-copy scenarios run non-live
Daniel P. Berrangé writes: > There are 27 pre-copy live migration scenarios being tested. In all of > these we force non-convergance and run for one iteration, then let it > converge and wait for completion during the second (or following) > iterations. At 3 mbps bandwidth limit the first iteration takes a very > long time (~30 seconds). > > While it is important to test the migration passes and convergance > logic, it is overkill to do this for all 27 pre-copy scenarios. The > TLS migration scenarios in particular are merely exercising different > code paths during connection establishment. > > To optimize time taken, switch most of the test scenarios to run > non-live (ie guest CPUs paused) with no bandwidth limits. This gives > a massive speed up for most of the test scenarios. > > For test coverage the following scenarios are unchanged > > * Precopy with UNIX sockets > * Precopy with UNIX sockets and dirty ring tracking > * Precopy with XBZRLE > * Precopy with multifd > > Signed-off-by: Daniel P. Berrangé > --- > tests/qtest/migration-test.c | 60 ++-- > 1 file changed, 50 insertions(+), 10 deletions(-) > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c > index 6492ffa7fe..40d0f75480 100644 > --- a/tests/qtest/migration-test.c > +++ b/tests/qtest/migration-test.c > @@ -568,6 +568,9 @@ typedef struct { > MIG_TEST_FAIL_DEST_QUIT_ERR, > } result; > > +/* Whether the guest CPUs should be running during migration */ > +bool live; > + > /* Postcopy specific fields */ > void *postcopy_data; > bool postcopy_preempt; > @@ -1324,8 +1327,6 @@ static void test_precopy_common(MigrateCommon *args) > return; > } > > -migrate_ensure_non_converge(from); > - > if (args->start_hook) { > data_hook = args->start_hook(from, to); > } > @@ -1335,6 +1336,31 @@ static void test_precopy_common(MigrateCommon *args) > wait_for_serial("src_serial"); > } > > +if (args->live) { > +/* > + * Testing live migration, we want to ensure that some > + * memory is re-dirtied after being transferred, so that > + * we exercise logic for dirty page handling. We achieve > + * this with a ridiculosly low bandwidth that guarantees > + * non-convergance. > + */ > +migrate_ensure_non_converge(from); > +} else { > +/* > + * Testing non-live migration, we allow it to run at > + * full speed to ensure short test case duration. > + * For tests expected to fail, we don't need to > + * change anything. > + */ > +if (args->result == MIG_TEST_SUCCEED) { > +qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}"); > +if (!got_stop) { > +qtest_qmp_eventwait(from, "STOP"); > +} > +migrate_ensure_converge(from); > +} > +} > + > if (!args->connect_uri) { > g_autofree char *local_connect_uri = > migrate_get_socket_address(to, "socket-address"); > @@ -1352,19 +1378,29 @@ static void test_precopy_common(MigrateCommon *args) > qtest_set_expected_status(to, EXIT_FAILURE); > } > } else { > -wait_for_migration_pass(from); > +if (args->live) { > +wait_for_migration_pass(from); > > -migrate_ensure_converge(from); > +migrate_ensure_converge(from); > > -/* We do this first, as it has a timeout to stop us > - * hanging forever if migration didn't converge */ > -wait_for_migration_complete(from); > +/* > + * We do this first, as it has a timeout to stop us > + * hanging forever if migration didn't converge > + */ > +wait_for_migration_complete(from); > + > +if (!got_stop) { > +qtest_qmp_eventwait(from, "STOP"); > +} > +} else { > +wait_for_migration_complete(from); > > -if (!got_stop) { > -qtest_qmp_eventwait(from, "STOP"); > +qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}"); I retested and the problem still persists. The issue is with this wait + cont sequence: wait_for_migration_complete(from); qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}"); We wait for the source to finish but by the time qmp_cont executes, the dst is still INMIGRATE, autostart gets set and I never see the RESUME event. When the dst migration finishes the VM gets put in RUN_STATE_PAUSED (at process_incoming_migration_bh): if (!global_state_received() || global_state_get_runstate() == RUN_STATE_RUNNING) { if (autostart) { vm_start(); } else { runstate_set(RUN_STATE_PAUSED); } } else if (migration_incoming_colo_enabled()) { migration_incoming_disable_colo(); vm_start(); }
Re: [PATCH v2 30/43] migration: Create migrate_max_bandwidth() function
Juan Quintela writes: > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH v2 29/43] migration: Move migrate_postcopy() to options.c
Juan Quintela writes: > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH v2 28/43] migration: Move migrate_use_tls() to options.c
Juan Quintela writes: > Once there, rename it to migrate_tls() and make it return bool for > consistency. > > Signed-off-by: Juan Quintela > --- > migration/migration.c | 9 - > migration/migration.h | 2 -- > migration/options.c | 16 +++- > migration/options.h | 9 + > migration/tls.c | 3 ++- > 5 files changed, 26 insertions(+), 13 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index 2191437b15..bbc9a07fd7 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -2251,15 +2251,6 @@ bool migrate_postcopy(void) > return migrate_postcopy_ram() || migrate_dirty_bitmaps(); > } > > -int migrate_use_tls(void) > -{ > -MigrationState *s; > - > -s = migrate_get_current(); > - > -return s->parameters.tls_creds && *s->parameters.tls_creds; > -} > - > /* migration thread support */ > /* > * Something bad happened to the RP stream, mark an error > diff --git a/migration/migration.h b/migration/migration.h > index 3ae938b19c..2099470e8e 100644 > --- a/migration/migration.h > +++ b/migration/migration.h > @@ -449,8 +449,6 @@ MigrationState *migrate_get_current(void); > > bool migrate_postcopy(void); > > -int migrate_use_tls(void); > - > uint64_t ram_get_total_transferred_pages(void); > > /* Sending on the return path - generic and then for each message type */ > diff --git a/migration/options.c b/migration/options.c > index a111d0d43f..6db221157f 100644 > --- a/migration/options.c > +++ b/migration/options.c > @@ -204,6 +204,20 @@ bool migrate_zero_copy_send(void) > > return s->capabilities[MIGRATION_CAPABILITY_ZERO_COPY_SEND]; > } > + > +/* pseudo capabilities */ > + > +bool migrate_tls(void) > +{ > +MigrationState *s; > + > +s = migrate_get_current(); > + > +return s->parameters.tls_creds && *s->parameters.tls_creds; > +} > + > + > + > typedef enum WriteTrackingSupport { > WT_SUPPORT_UNKNOWN = 0, > WT_SUPPORT_ABSENT, > @@ -353,7 +367,7 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, > Error **errp) > new_caps[MIGRATION_CAPABILITY_COMPRESS] || > new_caps[MIGRATION_CAPABILITY_XBZRLE] || > migrate_multifd_compression() || > - migrate_use_tls())) { > + migrate_tls())) { > error_setg(errp, > "Zero copy only available for non-compressed non-TLS > multifd migration"); > return false; > diff --git a/migration/options.h b/migration/options.h > index 99f6bbd7a1..c91d5cbef0 100644 > --- a/migration/options.h > +++ b/migration/options.h > @@ -38,6 +38,15 @@ bool migrate_xbzrle(void); > bool migrate_zero_blocks(void); > bool migrate_zero_copy_send(void); > > +/* > + * pseudo capabilities > + * > + * This are functions that are used in a similar way that capabilities > + * check, but they are not a capability. s/This/These/ s/that capabilities/to capabilities/ > + */ > + > +bool migrate_tls(void); > + > /* capabilities helpers */ > > bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp); > diff --git a/migration/tls.c b/migration/tls.c > index 4d2166a209..acd38e0b62 100644 > --- a/migration/tls.c > +++ b/migration/tls.c > @@ -22,6 +22,7 @@ > #include "channel.h" > #include "migration.h" > #include "tls.h" > +#include "options.h" > #include "crypto/tlscreds.h" > #include "qemu/error-report.h" > #include "qapi/error.h" > @@ -165,7 +166,7 @@ void migration_tls_channel_connect(MigrationState *s, > > bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc) > { > -if (!migrate_use_tls()) { > +if (!migrate_tls()) { > return false; > }
Re: [PATCH v2 27/43] migration: Create migrate_cpu_throttle_tailslow() function
Juan Quintela writes: > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH v2 26/43] migration: Create migrate_cpu_throttle_increment() function
Juan Quintela writes: > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH v2 25/43] migration: Create migrate_cpu_throttle_initial() to option.c
Juan Quintela writes: > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH v2 24/43] migration: Move migrate_announce_params() to option.c
Juan Quintela writes: > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas > --- > migration/migration.c | 14 -- > migration/options.c | 19 +++ > 2 files changed, 19 insertions(+), 14 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index dbb89c2e7b..2191437b15 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -954,20 +954,6 @@ MigrationParameters *qmp_query_migrate_parameters(Error > **errp) > return params; > } > > -AnnounceParameters *migrate_announce_params(void) > -{ > -static AnnounceParameters ap; > - > -MigrationState *s = migrate_get_current(); > - > -ap.initial = s->parameters.announce_initial; > -ap.max = s->parameters.announce_max; > -ap.rounds = s->parameters.announce_rounds; > -ap.step = s->parameters.announce_step; > - > -return > -} > - > /* > * Return true if we're already in the middle of a migration > * (i.e. any of the active or setup states) > diff --git a/migration/options.c b/migration/options.c > index 2cb04fbbd1..ed9d2a226f 100644 > --- a/migration/options.c > +++ b/migration/options.c > @@ -16,6 +16,7 @@ > #include "qapi/qapi-commands-migration.h" > #include "qapi/qmp/qerror.h" > #include "sysemu/runstate.h" > +#include "migration/misc.h" > #include "migration.h" > #include "ram.h" > #include "options.h" > @@ -589,3 +590,21 @@ uint64_t migrate_xbzrle_cache_size(void) > > return s->parameters.xbzrle_cache_size; > } > + > +/* parameters helpers */ > + > +AnnounceParameters *migrate_announce_params(void) > +{ > +static AnnounceParameters ap; > + > +MigrationState *s = migrate_get_current(); > + > +ap.initial = s->parameters.announce_initial; > +ap.max = s->parameters.announce_max; > +ap.rounds = s->parameters.announce_rounds; > +ap.step = s->parameters.announce_step; > + > +return > +} > + > + Extra whitespace here^
Re: [PATCH v2 23/43] migration: Create migrate_max_cpu_throttle()
Juan Quintela writes: > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH v2 22/43] migration: Create migrate_checkpoint_delay()
Juan Quintela writes: > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas
Re: [PATCH v2 21/43] migration: Create migrate_throttle_trigger_threshold()
Juan Quintela writes: > Signed-off-by: Juan Quintela > --- > migration/options.c | 9 + > migration/options.h | 1 + > migration/ram.c | 3 +-- > 3 files changed, 11 insertions(+), 2 deletions(-) > > diff --git a/migration/options.c b/migration/options.c > index 2b6d88b4b9..b9f3815f7e 100644 > --- a/migration/options.c > +++ b/migration/options.c > @@ -554,6 +554,15 @@ int migrate_multifd_zstd_level(void) > return s->parameters.multifd_zstd_level; > } > > +uint8_t migrate_throttle_trigger_threshold(void) > +{ > +MigrationState *s; > + > +s = migrate_get_current(); > + > +return s->parameters.throttle_trigger_threshold; > +} > + > uint64_t migrate_xbzrle_cache_size(void) > { > MigrationState *s; > diff --git a/migration/options.h b/migration/options.h > index 96d5a8e6e4..aa54443353 100644 > --- a/migration/options.h > +++ b/migration/options.h > @@ -55,6 +55,7 @@ int migrate_multifd_channels(void); > MultiFDCompression migrate_multifd_compression(void); > int migrate_multifd_zlib_level(void); > int migrate_multifd_zstd_level(void); > +uint8_t migrate_throttle_trigger_threshold(void); > uint64_t migrate_xbzrle_cache_size(void); > > #endif > diff --git a/migration/ram.c b/migration/ram.c > index 7f28588dde..68801012ba 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -1179,8 +1179,7 @@ static void migration_update_rates(RAMState *rs, > int64_t end_time) > > static void migration_trigger_throttle(RAMState *rs) > { > -MigrationState *s = migrate_get_current(); > -uint64_t threshold = s->parameters.throttle_trigger_threshold; > +uint64_t threshold = migrate_throttle_trigger_threshold(); > uint64_t bytes_xfer_period = > stat64_get(_counters.transferred) - rs->bytes_xfer_prev; > uint64_t bytes_dirty_period = rs->num_dirty_pages_period * > TARGET_PAGE_SIZE; Reviewed-by: Fabiano Rosas
Re: [PATCH v2 13/43] migration: Create migrate_rdma_pin_all() function
Juan Quintela writes: > Signed-off-by: Juan Quintela > Reviewed-by: Vladimir Sementsov-Ogievskiy > --- > migration/options.c | 7 +++ > migration/options.h | 1 + > migration/rdma.c| 6 +++--- > 3 files changed, 11 insertions(+), 3 deletions(-) > > diff --git a/migration/options.c b/migration/options.c > index 2003e413da..9c9b8e5863 100644 > --- a/migration/options.c > +++ b/migration/options.c > @@ -138,6 +138,13 @@ bool migrate_postcopy_ram(void) > return s->capabilities[MIGRATION_CAPABILITY_POSTCOPY_RAM]; > } > > +bool migrate_rdma_pin_all(void) > +{ > +MigrationState *s = migrate_get_current(); > + > +return s->capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL]; > +} > + > bool migrate_release_ram(void) > { > MigrationState *s; > diff --git a/migration/options.h b/migration/options.h > index 316efd1063..25c002b37a 100644 > --- a/migration/options.h > +++ b/migration/options.h > @@ -30,6 +30,7 @@ bool migrate_pause_before_switchover(void); > bool migrate_postcopy_blocktime(void); > bool migrate_postcopy_preempt(void); > bool migrate_postcopy_ram(void); > +bool migrate_rdma_pin_all(void); > bool migrate_release_ram(void); > bool migrate_return_path(void); > bool migrate_validate_uuid(void); > diff --git a/migration/rdma.c b/migration/rdma.c > index bf55e2f163..3e7b68c482 100644 > --- a/migration/rdma.c > +++ b/migration/rdma.c > @@ -35,6 +35,7 @@ > #include > #include "trace.h" > #include "qom/object.h" > +#include "options.h" > #include > > /* > @@ -4178,8 +4179,7 @@ void rdma_start_outgoing_migration(void *opaque, > goto err; > } > > -ret = qemu_rdma_source_init(rdma, > -s->capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL], errp); > +ret = qemu_rdma_source_init(rdma,migrate_rdma_pin_all(), errp); Missing a space after the comma here. > > if (ret) { > goto err; > @@ -4201,7 +4201,7 @@ void rdma_start_outgoing_migration(void *opaque, > } > > ret = qemu_rdma_source_init(rdma_return_path, > -s->capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL], errp); > +migrate_rdma_pin_all(), errp); > > if (ret) { > goto return_path_err;
Re: [PATCH v2 03/43] migration: Create migration_cap_set()
Juan Quintela writes: > And remove the convoluted use of qmp_migrate_set_capabilities() to > enable disable MIGRATION_CAPABILITY_BLOCK. > > Signed-off-by: Juan Quintela > --- > migration/migration.c | 34 -- > 1 file changed, 16 insertions(+), 18 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index 74f28cdca6..4bf5df4778 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -1913,25 +1913,24 @@ void migrate_set_state(int *state, int old_state, int > new_state) > } > } > > -static MigrationCapabilityStatus *migrate_cap_add(MigrationCapability index, > - bool state) > +static bool migrate_cap_set(int cap, bool value, Error **errp) Just a nit, the commit message says migration_cap_set. Reviewed-by: Fabiano Rosas
[PATCH v3 11/12] tests/qemu-iotests: Require virtio-scsi-pci
Check that virtio-scsi-pci is present in the QEMU build before running the tests. Signed-off-by: Fabiano Rosas Reviewed-by: Thomas Huth --- tests/qemu-iotests/186 | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/qemu-iotests/186 b/tests/qemu-iotests/186 index 072e54e62b..eaf13c7a33 100755 --- a/tests/qemu-iotests/186 +++ b/tests/qemu-iotests/186 @@ -40,6 +40,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 _supported_fmt qcow2 _supported_proto file fuse _require_drivers null-co +_require_devices virtio-scsi-pci if [ "$QEMU_DEFAULT_MACHINE" != "pc" ]; then _notrun "Requires a PC machine" -- 2.35.3
[PATCH v2 11/12] tests/qemu-iotests: Require virtio-scsi-pci
Check that virtio-scsi-pci is present in the QEMU build before running the tests. Signed-off-by: Fabiano Rosas Reviewed-by: Thomas Huth --- tests/qemu-iotests/186 | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/qemu-iotests/186 b/tests/qemu-iotests/186 index 072e54e62b..eaf13c7a33 100755 --- a/tests/qemu-iotests/186 +++ b/tests/qemu-iotests/186 @@ -40,6 +40,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 _supported_fmt qcow2 _supported_proto file fuse _require_drivers null-co +_require_devices virtio-scsi-pci if [ "$QEMU_DEFAULT_MACHINE" != "pc" ]; then _notrun "Requires a PC machine" -- 2.35.3
Re: random copy-before-write iotest failure
Peter Maydell writes: > This is on ppc64 (big-endian), a random failure > (it was while testing the riscv pullreq, but that doesn't touch > any of the block stuff): > > 616/635 qemu:block / qemu-iotests qcow2 >ERROR > 101.88s exit status 1 > ― ✀ ― > stderr: > --- /home/pm215/qemu/tests/qemu-iotests/tests/copy-before-write.out > +++ > /home/pm215/qemu/build/all/tests/qemu-iotests/scratch/copy-before-write/copy-before-wri > te.out.bad > @@ -1,5 +1,21 @@ > - > +..F. > +== > +FAIL: test_timeout_break_guest > (__main__.TestCbwError.test_timeout_break_guest) > +-- > +Traceback (most recent call last): > + File "/home/pm215/qemu/tests/qemu-iotests/tests/copy-before-write", > line 200, in test_ti > meout_break_guest > +self.assertEqual(log, """\ > +AssertionError: 'wrot[90 chars])\nwrote 524288/524288 bytes at offset > 524288\[151 chars]c)\n' != 'wrot[90 chars])\nwrite failed: Connection > timed out\nread 10[85 chars]c)\n' > + wrote 524288/524288 bytes at offset 0 > + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > ++ write failed: Connection timed out > +- wrote 524288/524288 bytes at offset 524288 > +- 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > + read 1048576/1048576 bytes at offset 0 > + 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > + > + > -- > Ran 4 tests > > -OK > +FAILED (failures=1) > > thanks > -- PMM FWIW, I've seen that one fail on aarch64 with --enable-debug on the configure line.
[PATCH 10/12] tests/qemu-iotests: Require virtio-scsi-pci
Check that virtio-scsi-pci is present in the QEMU build before running the tests. Signed-off-by: Fabiano Rosas --- tests/qemu-iotests/186 | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/qemu-iotests/186 b/tests/qemu-iotests/186 index 072e54e62b..eaf13c7a33 100755 --- a/tests/qemu-iotests/186 +++ b/tests/qemu-iotests/186 @@ -40,6 +40,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 _supported_fmt qcow2 _supported_proto file fuse _require_drivers null-co +_require_devices virtio-scsi-pci if [ "$QEMU_DEFAULT_MACHINE" != "pc" ]; then _notrun "Requires a PC machine" -- 2.35.3
Re: [PATCH] block/io: Check for replay-enabled in bdrv_drain_all_begin()
Peter Maydell writes: > In commit da0bd74434 we refactored bdrv_drain_all_begin() to pull out > the non-polling part into bdrv_drain_all_begin_nopoll(). This change > broke record-and-replay, because the "return early if replay enabled" > check is now in the sub-function bdrv_drain_all_begin_nopoll(), and > so it only causes us to return from that function, and not from the > calling bdrv_drain_all_begin(). > > Fix the regression by checking whether replay is enabled in both > functions. > > The breakage and fix can be tested via 'make check-avocado': the > tests/avocado/reverse_debugging.py:ReverseDebugging_X86_64.test_x86_64_pc > tests/avocado/reverse_debugging.py:ReverseDebugging_AArch64.test_aarch64_virt > tests were both broken by this. > > Fixes: da0bd744344adb1f285 ("block: Factor out bdrv_drain_all_begin_nopoll()") > Signed-off-by: Peter Maydell Tested-by: Fabiano Rosas
Re: [PULL v3 00/50] Block layer patches
Kevin Wolf writes: > The following changes since commit 48804eebd4a327e4b11f902ba80a00876ee53a43: > > Merge tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru into > staging (2022-12-15 10:13:46 +) > > are available in the Git repository at: > > https://repo.or.cz/qemu/kevin.git tags/for-upstream > > for you to fetch changes up to 1b3ff9feb942c2ad0b01ac931e99ad451ab0ef39: > > block: GRAPH_RDLOCK for functions only called by co_wrappers (2022-12-15 > 16:08:23 +0100) > > v3: > - Dropped "configure: Enable -Wthread-safety if present" because FreeBSD > has TSA annotations in its pthread locking functions, so we would have > to annotate the use of every lock in QEMU first before we can enable > it. > > v2: > - Changed TSA capability name to "mutex" to work with older clang > versions. The tsan-build CI job succeeds now. > > > Block layer patches > > - Code cleanups around block graph modification > - Simplify drain > - coroutine_fn correctness fixes, including splitting generated > coroutine wrappers into co_wrapper (to be called only from > non-coroutine context) and co_wrapper_mixed (both coroutine and > non-coroutine context) > - Introduce a block graph rwlock > > > Emanuele Giuseppe Esposito (21): > block-io: introduce coroutine_fn duplicates for > bdrv_common_block_status_above callers > block-copy: add coroutine_fn annotations > nbd/server.c: add coroutine_fn annotations > block-backend: replace bdrv_*_above with blk_*_above > block/vmdk: add coroutine_fn annotations > block: avoid duplicating filename string in bdrv_create > block: distinguish between bdrv_create running in coroutine and not > block: bdrv_create_file is a coroutine_fn > block: rename generated_co_wrapper in co_wrapper_mixed > block-coroutine-wrapper.py: introduce co_wrapper > block-coroutine-wrapper.py: support functions without bs arg > block-coroutine-wrapper.py: support also basic return types > block: convert bdrv_create to co_wrapper > block/dirty-bitmap: convert coroutine-only functions to co_wrapper > graph-lock: Implement guard macros > async: Register/unregister aiocontext in graph lock list > block: wrlock in bdrv_replace_child_noperm > block: remove unnecessary assert_bdrv_graph_writable() > block: assert that graph read and writes are performed correctly > block-coroutine-wrapper.py: introduce annotations that take the graph > rdlock > block: use co_wrapper_mixed_bdrv_rdlock in functions taking the rdlock > > Kevin Wolf (24): > qed: Don't yield in bdrv_qed_co_drain_begin() > test-bdrv-drain: Don't yield in .bdrv_co_drained_begin/end() > block: Revert .bdrv_drained_begin/end to non-coroutine_fn > block: Remove drained_end_counter > block: Inline bdrv_drain_invoke() > block: Fix locking for bdrv_reopen_queue_child() > block: Drain individual nodes during reopen > block: Don't use subtree drains in bdrv_drop_intermediate() > stream: Replace subtree drain with a single node drain > block: Remove subtree drains > block: Call drain callbacks only once > block: Remove ignore_bds_parents parameter from drain_begin/end. > block: Drop out of coroutine in bdrv_do_drained_begin_quiesce() > block: Don't poll in bdrv_replace_child_noperm() > block: Remove poll parameter from bdrv_parent_drained_begin_single() > block: Factor out bdrv_drain_all_begin_nopoll() Hi, With today's master at c15dc499cc (Merge tag 'pull-misc-20221218' of https://gitlab.com/rth7680/qemu into staging, 2022-12-19), I get a test failure: $ make check-avocado AVOCADO_TESTS=../tests/avocado/reverse_debugging.py:ReverseDebugging_X86_64.test_x86_64_pc ... Output: qemu-system-x86_64: ../block/block-backend.c:2572: blk_root_drained_poll: Assertion `blk->quiesce_counter' failed. Bisect points to this patch da0bd74434 (block: Factor out bdrv_drain_all_begin_nopoll(), 2022-12-07) Command from avocado logs (paths cut to save space): ./qemu-system-x86_64 -display none -vga none \ -chardev socket,id=mon,path=monitor.sock \ -mon chardev=mon,mode=control -machine pc \ -chardev socket,id=console,path=console.sock,server=on,wait=off \ -serial chardev:console -icount shift=7,rr=record,rrfile=replay.bin,rrsnapshot=init \ -net none -drive file=disk.qcow2,if=none Happens with arm as well: ./qemu-system-aarch64 -display none -vga none \ -chardev socket,id=mon,path=monitor.sock \ -mon chardev=mon,mode=control -machine virt \ -chardev socket,id=console,path=console.sock,server=on,wait=off \ -serial chardev:console -cpu cortex-a53 -icount \ shift=7,rr=record,rrfile=replay.bin,rrsnapshot=init \ -net none -drive file=disk.qcow2,if=none \ -kernel vmlinuz > Import clang-tsa.h >
Re: [PATCH v2 3/3] ui: remove deprecated 'password' option for SPICE
Daniel P. Berrangé writes: > This has been replaced by the 'password-secret' option, > which references a 'secret' object instance. > > Reviewed-by: Markus Armbruster > Signed-off-by: Daniel P. Berrangé Reviewed-by: Fabiano Rosas Just a small detail below. > --- > docs/about/deprecated.rst | 8 > docs/about/removed-features.rst | 7 +++ > qemu-options.hx | 9 + > ui/spice-core.c | 15 --- > 4 files changed, 8 insertions(+), 31 deletions(-) > > diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst > index daf2334040..8fbe7cb5fe 100644 > --- a/docs/about/deprecated.rst > +++ b/docs/about/deprecated.rst > @@ -73,14 +73,6 @@ Input parameters that take a size value should only use a > size suffix > the value is hexadecimal. That is, '0x20M' is deprecated, and should > be written either as '32M' or as '0x200'. > > -``-spice password=string`` (since 6.0) > -'' > - > -This option is insecure because the SPICE password remains visible in > -the process listing. This is replaced by the new ``password-secret`` > -option which lets the password be securely provided on the command > -line using a ``secret`` object instance. > - > ``-smp`` ("parameter=0" SMP configurations) (since 6.2) > ''' > > diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst > index 63df9848fd..2cbb1b7afe 100644 > --- a/docs/about/removed-features.rst > +++ b/docs/about/removed-features.rst > @@ -408,6 +408,13 @@ pcspk-audiodev=``. > > Use ``-device`` instead. > > +``-spice password=string`` (removed in 8.0) > +''' > + > +This optionwas insecure because the SPICE password remained visible in Missing a space here. > +the process listing. This was replaced by the new ``password-secret`` > +option which lets the password be securely provided on the command > +line using a ``secret`` object instance. > > QEMU Machine Protocol (QMP) commands > > diff --git a/qemu-options.hx b/qemu-options.hx > index 58efb58072..847d71e567 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -2132,7 +2132,7 @@ DEF("spice", HAS_ARG, QEMU_OPTION_spice, > " [,tls-channel=[main|display|cursor|inputs|record|playback]]\n" > " > [,plaintext-channel=[main|display|cursor|inputs|record|playback]]\n" > " [,sasl=on|off][,disable-ticketing=on|off]\n" > -" [,password=][,password-secret=]\n" > +" [,password-secret=]\n" > " [,image-compression=[auto_glz|auto_lz|quic|glz|lz|off]]\n" > " [,jpeg-wan-compression=[auto|never|always]]\n" > " [,zlib-glz-wan-compression=[auto|never|always]]\n" > @@ -2158,13 +2158,6 @@ SRST > ``ipv4=on|off``; \ ``ipv6=on|off``; \ ``unix=on|off`` > Force using the specified IP version. > > -``password=`` > -Set the password you need to authenticate. > - > -This option is deprecated and insecure because it leaves the > -password visible in the process listing. Use ``password-secret`` > -instead. > - > ``password-secret=`` > Set the ID of the ``secret`` object containing the password > you need to authenticate. > diff --git a/ui/spice-core.c b/ui/spice-core.c > index 72f8f1681c..76f7c2bc3d 100644 > --- a/ui/spice-core.c > +++ b/ui/spice-core.c > @@ -412,9 +412,6 @@ static QemuOptsList qemu_spice_opts = { > .name = "unix", > .type = QEMU_OPT_BOOL, > #endif > -},{ > -.name = "password", > -.type = QEMU_OPT_STRING, > },{ > .name = "password-secret", > .type = QEMU_OPT_STRING, > @@ -666,20 +663,8 @@ static void qemu_spice_init(void) > } > passwordSecret = qemu_opt_get(opts, "password-secret"); > if (passwordSecret) { > -if (qemu_opt_get(opts, "password")) { > -error_report("'password' option is mutually exclusive with " > - "'password-secret'"); > -exit(1); > -} > password = qcrypto_secret_lookup_as_utf8(passwordSecret, > _fatal); > -} else { > -str = qemu_opt_get(opts, "password"); > -if (str) { > -warn_report("'password' option is deprecated and insecure, " > -"use 'password-secret' instead"); > -password = g_strdup(str); > -} > } > > if (tls_port) {
Re: [PATCH v2 2/3] block: deprecate iSCSI 'password' in favour of 'password-secret'
Daniel P. Berrangé writes: > Support for referencing secret objects was added in > > commit b189346eb1784df95ed6fed610411dbf23d19e1f > Author: Daniel P. Berrangé > Date: Thu Jan 21 14:19:21 2016 + > > iscsi: add support for getting CHAP password via QCryptoSecret API > > The existing 'password' option is overdue for deprecation and > subsequent removal. > > Signed-off-by: Daniel P. Berrangé Reviewed-by: Fabiano Rosas
Re: [PATCH v2 1/3] block: mention 'password-secret' option for -iscsi
Daniel P. Berrangé writes: > The 'password-secret' option was added > > commit b189346eb1784df95ed6fed610411dbf23d19e1f > Author: Daniel P. Berrangé > Date: Thu Jan 21 14:19:21 2016 + > > iscsi: add support for getting CHAP password via QCryptoSecret API > > but was not mentioned in the command line docs > > Reviewed-by: Markus Armbruster > Signed-off-by: Daniel P. Berrangé Reviewed-by: Fabiano Rosas
Re: [PATCH 1/3] block: mention 'password-secret' option for -iscsi
Daniel P. Berrangé writes: > The 'password-secret' option was added > > commit b189346eb1784df95ed6fed610411dbf23d19e1f > Author: Daniel P. Berrangé > Date: Thu Jan 21 14:19:21 2016 + > > iscsi: add support for getting CHAP password via QCryptoSecret API > > but was not mentioned in the command line docs > > Signed-off-by: Daniel P. Berrangé > --- > qemu-options.hx | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/qemu-options.hx b/qemu-options.hx > index 7f99d15b23..055df73306 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -1889,7 +1889,7 @@ SRST > ERST > > DEF("iscsi", HAS_ARG, QEMU_OPTION_iscsi, > -"-iscsi [user=user][,password=password]\n" > +"-iscsi [user=user][,password=password],password-secret=secret-id]\n" Loos like you're missing a bracket before the comma. The line below also lacks one at the end. > " [,header-digest=CRC32C|CR32C-NONE|NONE-CRC32C|NONE\n" > " [,initiator-name=initiator-iqn][,id=target-iqn]\n" > " [,timeout=timeout]\n"