qcow2_do_open() calls a few no_co_wrappers that wrap functions taking
the graph lock internally as a writer. Therefore, it can't hold the
reader lock across these calls; doing so causes deadlocks. Drop the
lock temporarily around the calls.
Signed-off-by: Kevin Wolf
---
block/qcow2.c | 6 ++
1
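In isolation, the shape of that fix can be sketched with stand-in lock functions (graph_rd_lock() and friends below are simplified models for illustration, not QEMU's real graph-lock API):

```c
#include <assert.h>

static int readers;          /* count of reader-lock holders */
static int writer_held;      /* 1 while the writer lock is held */

static void graph_rd_lock(void)   { assert(!writer_held); readers++; }
static void graph_rd_unlock(void) { readers--; }

/* Models a no_co_wrapper that takes the writer lock internally: it
 * could never acquire it if a reader lock were still held. */
static void open_child_with_writer_lock(void)
{
    assert(readers == 0);    /* a held reader lock would deadlock here */
    writer_held = 1;
    writer_held = 0;
}

/* Models qcow2_do_open(): runs under the reader lock, but drops it
 * temporarily around the call that locks the graph as a writer. */
static int do_open(void)
{
    graph_rd_lock();
    /* ... work that needs the reader lock ... */
    graph_rd_unlock();               /* drop the lock temporarily ... */
    open_child_with_writer_lock();   /* ... around this call ... */
    graph_rd_lock();                 /* ... then take it again */
    /* ... more work under the reader lock ... */
    graph_rd_unlock();
    return readers;
}
```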
These are functions that modify the graph, so they must be able to take
a writer lock. This is impossible if they already hold the reader lock.
If they need a reader lock for some of their operations, they should
take it internally.
Many of them go through blk_*(), which will always take the lock
There are some error paths in blk_exp_add() that jump to 'fail:' before
'exp' is even created. So we can't just unconditionally access exp->blk.
Add a NULL check, and switch from exp->blk to blk, which is available
earlier, just to be extra sure that we really cover all cases where
BlockDevOps
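The error-path shape being fixed can be sketched with stand-in types (exp_add(), struct Exp and the integer blk below are illustrative only, not the real blk_exp_add() code):

```c
#include <assert.h>
#include <stdlib.h>

struct Exp { int blk; };

/* Models blk_exp_add(): some error paths jump to 'fail:' before 'exp'
 * is created, so the cleanup must NULL-check it and use the
 * earlier-available 'blk' handle rather than exp->blk. */
static int exp_add(int fail_early, int *blk_ops_cleared)
{
    struct Exp *exp = NULL;
    int blk = 1;                  /* available before 'exp' exists */

    if (fail_early) {
        goto fail;                /* jumps before 'exp' is created */
    }
    exp = malloc(sizeof(*exp));
    goto fail;                    /* models a later failure */

fail:
    if (exp) {                    /* NULL check: 'exp' may not exist yet */
        free(exp);
    }
    *blk_ops_cleared = blk;       /* use 'blk', not exp->blk */
    return -1;
}

static int demo(void)
{
    int early = 0, late = 0;
    int r1 = exp_add(1, &early);  /* fails before 'exp' is allocated */
    int r2 = exp_add(0, &late);   /* fails after 'exp' is allocated */
    return r1 == -1 && r2 == -1 && early && late;
}
```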
bdrv_unref() is a no_coroutine_fn, so calling it from coroutine context
is invalid. Use bdrv_co_unref() instead.
Signed-off-by: Kevin Wolf
---
tests/unit/test-bdrv-drain.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/unit/test-bdrv-drain.c
If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.
Signed-off-by: Kevin Wolf
---
tests/unit/test-bdrv-drain.c
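The scoping idea can be sketched with a simplified guard; QEMU's GRAPH_RDLOCK_GUARD() is defined differently, so the macro below is only an illustrative stand-in built on __attribute__((cleanup)):

```c
#include <assert.h>

static int readers;               /* count of reader-lock holders */

static void rd_unlock_cb(int *unused)
{
    (void)unused;
    readers--;
}

/* Scoped guard: reader lock held only until the end of the enclosing
 * block, released automatically via __attribute__((cleanup)). */
#define RDLOCK_GUARD() \
    __attribute__((cleanup(rd_unlock_cb))) int rd_guard_ = (readers++, 0)

static int narrow_scope(void)
{
    /* setup that does not touch the graph: no lock held here */
    {
        RDLOCK_GUARD();           /* reader lock only in this block */
        assert(readers == 1);     /* graph accesses go here */
    }
    /* lock released: now safe to call writer-locking functions */
    return readers;
}
```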
There are some conditions under which we don't actually need to do
anything for taking a reader lock: Writing the graph is only possible
from the main context while holding the BQL. So if a reader is running
in the main context under the BQL and knows that it won't be interrupted
until the next
When jobs are sleeping, for example to enforce a given rate limit, they
can be reentered early, in particular in order to get paused, to update
the rate limit or to get cancelled.
Before this patch, they behave in this case as if they had fully
completed their rate limiting delay. This means that
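The accounting that early reentry needs can be sketched with a simulated clock; all names below are stand-ins for illustration, not the real job/timer API:

```c
#include <assert.h>
#include <stdint.h>

static int64_t fake_now;          /* simulated clock, in ns */

struct Job { int64_t deadline; };

/* Start a rate-limit sleep that should end at an absolute deadline. */
static void job_sleep_until(struct Job *job, int64_t deadline)
{
    job->deadline = deadline;
}

/* Early reentry (pause, cancel, rate-limit update): the delay is NOT
 * fully served; return how much of it is still owed so the job can
 * resume sleeping instead of proceeding as if it had waited. */
static int64_t job_wake_early(struct Job *job)
{
    int64_t remaining = job->deadline - fake_now;
    return remaining > 0 ? remaining : 0;
}

static int demo(void)
{
    struct Job job;
    fake_now = 0;
    job_sleep_until(&job, 100);   /* owes 100 ns of delay */
    fake_now = 30;                /* woken early after only 30 ns */
    return (int)job_wake_early(&job);
}
```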
If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.
Signed-off-by: Kevin Wolf
---
qemu-img.c | 5 +++--
1 file
Fiona asked why it's correct that bdrv_graph_co_rd_lock/unlock() do
nothing if qemu_in_main_thread() returns true. As far as I can tell,
it's not correct. The coroutine can yield while it believes it holds the
lock, and then the main loop can call some code that takes a write lock
without waiting
From: Vladimir Sementsov-Ogievskiy
colo_checkpoint_notify() is mostly used in colo.c. Outside of it, we
use it once, when the x-checkpoint-delay migration parameter is set.
So, let's simplify the external API to only that function - notify COLO
that the parameter was set. This makes the external API more robust and
From: Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Juan Quintela
Reviewed-by: Philippe Mathieu-Daudé
Reviewed-by: Lukas Straub
Reviewed-by: Zhang Chen
Message-Id: <20230428194928.1426370-2-vsement...@yandex-team.ru>
Signed-off-by: Juan Quintela
---
From: Lukas Straub
This is not required, colo_flush_ram_cache does not run concurrently
with the multifd threads since the cache is only flushed after
everything has been received. But it makes me more comfortable.
This will be used in the next commits to add colo support to multifd.
From: Vladimir Sementsov-Ogievskiy
The have_colo_incoming_thread variable is unused, and
colo_incoming_thread can be local.
Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Juan Quintela
Reviewed-by: Peter Xu
Reviewed-by: Zhang Chen
Message-Id:
From: Vladimir Sementsov-Ogievskiy
We generally require the same set of capabilities on the source and the
target. Let's require the x-colo capability to use COLO on the target.
Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Juan Quintela
Reviewed-by: Peter Xu
Reviewed-by: Lukas Straub
Reviewed-by:
From: Vladimir Sementsov-Ogievskiy
COLO is not listed as a running state in migrate_is_running(), so it is
theoretically possible to disable the colo capability while in COLO
state, making the unexpected error in migration_iteration_finish()
reachable. Let's disallow that in qmp_migrate_set_capabilities.
From: Vladimir Sementsov-Ogievskiy
We don't allow using the x-colo capability when replication is not
configured. So there is no reason to build COLO when replication is
disabled; it is unusable in this case.
Note also that the check in migrate_caps_check() is not the only
restriction: some functions in
The following changes since commit caa9cbd566877b34e9abcc04d936116fc5e0ab28:
Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging
(2023-05-10 14:52:03 +0100)
are available in the Git repository at:
https://gitlab.com/juan.quintela/qemu.git
From: Lukas Straub
The overhead of the mutex in non-multifd mode is negligible,
because in that case it's just the single thread taking the mutex.
This will be used in the next commits to add colo support to multifd.
Signed-off-by: Lukas Straub
Reviewed-by: Juan Quintela
Message-Id:
From: Lukas Straub
This will be used in the next commits to add colo support to multifd.
Signed-off-by: Lukas Straub
Reviewed-by: Juan Quintela
Message-Id:
<88135197411df1a71d7832962b39abf60faf0021.1683572883.git.lukasstra...@web.de>
Signed-off-by: Juan Quintela
---
migration/multifd.h |
From: Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Juan Quintela
Reviewed-by: Peter Xu
Reviewed-by: Zhang Chen
Message-Id: <20230428194928.1426370-7-vsement...@yandex-team.ru>
Signed-off-by: Juan Quintela
---
migration/migration.c | 11 ++-
On 5/4/23 4:28 AM, Stefano Garzarella wrote:
v2:
- added patch 01 to use monitor_fd_param() in the blkio module
- use monitor_fd_param() to parse the fd like vhost devices [Stefan]
v1:
https://lore.kernel.org/qemu-devel/20230502145050.224615-1-sgarz...@redhat.com/
The virtio-blk-vhost-vdpa
On 5/10/23 13:20, Kevin Wolf wrote:
The following changes since commit b2896c1b09878fd1c4b485b3662f8beecbe0fef4:
Merge tag 'vfio-updates-20230509.0' of
https://gitlab.com/alex.williamson/qemu into staging (2023-05-10 11:20:35 +0100)
are available in the Git repository at:
On 5/10/23 17:21, Vladimir Sementsov-Ogievskiy wrote:
Interesting, I see two 5/6 mails, equal body, but a bit different
headers (the second has an empty "Sender")..
for me all is OK
On 10.05.23 18:06, Vladimir Sementsov-Ogievskiy wrote:
Hi all!
Let's refactor the QMP transactions implementation onto the
(relatively) new transaction API.
v9:
01: fix leaks
02-03: add r-b
Hi all!
Let's refactor the QMP transactions implementation onto the
(relatively) new transaction API.
v9:
01: fix leaks
02-03: add r-b
04: fix leak, s/Transaction/transaction/
05: new, was part of 06
06: rework of bitmap-add action moved to 05
Vladimir Sementsov-Ogievskiy (6):
blockdev: refactor
Only backup supports GROUPED mode. Make this logic clearer, and
avoid passing an extra thing to each action.
Signed-off-by: Vladimir Sementsov-Ogievskiy
---
blockdev.c | 96 +-
1 file changed, 22 insertions(+), 74 deletions(-)
diff --git
We are going to add more block-graph modifying transaction actions,
and block-graph modifying functions are already based on the
Transaction API.
Next, we'll need to separately update permissions after several
graph-modifying actions, and this will be simple with the help of the
Transaction API.
So, now
Let's simplify things:
First, actions generally don't need access to the common BlkActionState
structure. The only exception is the backup actions, which need
block_job_txn.
Next, for transaction actions it is more natural for the Transaction
API to allocate the state structure in the action itself.
So, do the
Other bitmap-related actions use the .bitmap pointer in their .abort
action; let's do the same here:
1. It helps further refactoring, as bitmap-add is the only bitmap
action that uses state.action in .abort
2. It must be safe: transaction actions rely on the fact that on
.abort() the state is the
Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Kevin Wolf
---
blockdev.c | 9 +++--
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/blockdev.c b/blockdev.c
index f72084d206..f236e5c27e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2367,7 +2367,7 @@ void
Look at qmp_transaction(): dev_list is not an obvious name for a list
of actions. Looking at the QAPI spec, this argument is "actions". Let's
follow the common practice of using the same argument names in the QAPI
schema and in the code.
For the same reason, also rename props to properties.
Next, we have to
On 5/10/23 15:08, Juan Quintela wrote:
grep " uint;" on my system includes. I know that there are more
creative ways to define it.
/usr/include/ffi-x86_64.h:278: ffi_arg uint;
Thankfully only a structure member. :-)
/usr/include/sys/types.h:150:typedef unsigned int uint;
Oof.
Richard Henderson wrote:
> On 5/10/23 13:20, Juan Quintela wrote:
>> Richard Henderson wrote:
>>> On 5/9/23 20:17, Juan Quintela wrote:
The following changes since commit
271477b59e723250f17a7e20f139262057921b6a:
Merge tag 'compression-code-pull-request' of
On 10.05.23 14:47, Kevin Wolf wrote:
Am 21.04.2023 um 13:53 hat Vladimir Sementsov-Ogievskiy geschrieben:
Let's simplify things:
First, actions generally don't need access to the common BlkActionState
structure. The only exception is the backup actions, which need
block_job_txn.
Next, for transaction
This QMP handler runs in a coroutine, so it must use the corresponding
no_co_wrappers instead.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2185688
Cc: qemu-sta...@nongnu.org
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Message-Id:
The following changes since commit b2896c1b09878fd1c4b485b3662f8beecbe0fef4:
Merge tag 'vfio-updates-20230509.0' of
https://gitlab.com/alex.williamson/qemu into staging (2023-05-10 11:20:35 +0100)
are available in the Git repository at:
https://repo.or.cz/qemu/kevin.git tags/for-upstream
Richard Henderson wrote:
> On 5/9/23 20:17, Juan Quintela wrote:
>> The following changes since commit 271477b59e723250f17a7e20f139262057921b6a:
>>Merge tag 'compression-code-pull-request' of
>> https://gitlab.com/juan.quintela/qemu into staging (2023-05-08
>> 20:38:05 +0100)
>> are available
On 5/10/23 13:20, Juan Quintela wrote:
Richard Henderson wrote:
On 5/9/23 20:17, Juan Quintela wrote:
The following changes since commit 271477b59e723250f17a7e20f139262057921b6a:
Merge tag 'compression-code-pull-request' of
https://gitlab.com/juan.quintela/qemu into staging (2023-05-08
From: Vladimir Sementsov-Ogievskiy
Let's add --enable / --disable configure options for these formats,
so that those who don't need them may not build them.
Signed-off-by: Vladimir Sementsov-Ogievskiy
Message-Id: <20230421092758.814122-1-vsement...@yandex-team.ru>
Reviewed-by: Kevin Wolf
This adds GRAPH_RDLOCK annotations to declare that functions accessing
the parent list of a node need to hold a reader lock for the graph. As
it happens, they already do.
Signed-off-by: Kevin Wolf
Message-Id: <20230504115750.54437-13-kw...@redhat.com>
Reviewed-by: Eric Blake
Reviewed-by: Stefan
These functions must not be called in coroutine context, because they
need write access to the graph.
Cc: qemu-sta...@nongnu.org
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Message-Id: <20230504115750.54437-4-kw...@redhat.com>
Signed-off-by: Kevin Wolf
---
From: Emanuele Giuseppe Esposito
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_co_debug_event() need to hold a reader lock for the graph.
Unfortunately we cannot use a co_wrapper_bdrv_rdlock (i.e. make the
coroutine wrapper a no_coroutine_fn), because the function is called
GRAPH_RDLOCK_GUARD() and GRAPH_RDLOCK_GUARD_MAINLOOP() only take a
reader lock for the graph, so the correct annotation for them to use is
TSA_ASSERT_SHARED rather than TSA_ASSERT.
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Message-Id: <20230504115750.54437-8-kw...@redhat.com>
The only thing nbd_co_flush() does is call nbd_client_co_flush(). Just
use that function directly in the BlockDriver definitions and remove the
wrapper.
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Message-Id: <20230504115750.54437-10-kw...@redhat.com>
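The pattern here is simply deleting a forwarding wrapper and registering the callee directly; sketched below with stand-in names, not the actual NBD driver symbols:

```c
#include <assert.h>
#include <stddef.h>

struct BlockOps { int (*flush)(void *opaque); };

static int client_co_flush(void *opaque)
{
    (void)opaque;
    return 0;                     /* success */
}

/* Before: .flush pointed at a wrapper whose entire body was
 * "return client_co_flush(opaque);". After: the callee is registered
 * directly and the wrapper is deleted. */
static const struct BlockOps ops = { .flush = client_co_flush };

static int demo(void)
{
    return ops.flush(NULL);
}
```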
From: Emanuele Giuseppe Esposito
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_co_get_allocated_file_size() need to hold a reader lock for the
graph.
Signed-off-by: Emanuele Giuseppe Esposito
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Message-Id:
From: Emanuele Giuseppe Esposito
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_co_get_info() need to hold a reader lock for the graph.
Signed-off-by: Emanuele Giuseppe Esposito
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Message-Id:
test-bdrv-drain contains a few test cases that are run both in coroutine
and non-coroutine context. Running the entire code including the setup
and shutdown in coroutines is incorrect because graph modifications can
generally not happen in coroutines.
Change the test so that creating and
This adds GRAPH_RDLOCK annotations to declare that functions accessing
the parent list of a node need to hold a reader lock for the graph. As
it happens, they already do.
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Message-Id:
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_query_block_graph_info() need to hold a reader lock for the graph
because it accesses the children list of a node.
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Message-Id:
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_refresh_limits() need to hold a reader lock for the graph because
it accesses the children list of a node.
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Message-Id:
From: Emanuele Giuseppe Esposito
This adds GRAPH_RDLOCK annotations to declare that callers of
nbd_co_do_establish_connection() need to hold a reader lock for the
graph.
Signed-off-by: Emanuele Giuseppe Esposito
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_recurse_can_replace() need to hold a reader lock for the graph
because it accesses the children list of a node.
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Message-Id:
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_query_bds_stats() need to hold a reader lock for the graph because
it accesses the children list of a node.
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Message-Id:
This tests that trying to resize an image with QMP block_resize doesn't
hang or otherwise fail when the image is attached to a device running in
an iothread.
This is a regression test for the recent fix that changed
qmp_block_resize, which is a coroutine based QMP handler, to avoid
calling
From: Stefan Hajnoczi
There is no need for the AioContext lock in aio_wait_bh_oneshot().
It's easy to remove the lock from existing callers and then switch from
AIO_WAIT_WHILE() to AIO_WAIT_WHILE_UNLOCKED() in aio_wait_bh_oneshot().
Document that the AioContext lock should not be held across
For some functions, it is part of their interface to be called without
holding the graph lock. Add a new macro to document this.
The macro expands to TSA_EXCLUDES(), which is a relatively weak check
because it passes in cases where the compiler just doesn't know if the
lock is held. Function
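A rough sketch of how such a macro can sit on top of Clang thread safety analysis; the macro and lock names below are assumptions, not QEMU's exact definitions, and with non-Clang compilers the annotation expands to nothing, matching the "relatively weak check" caveat above:

```c
#include <assert.h>

#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)                    /* no analysis: expands to nothing */
#endif
#define TSA_EXCLUDES(...) TSA(locks_excluded(__VA_ARGS__))
#define GRAPH_UNLOCKED TSA_EXCLUDES(graph_lock)

static int graph_lock;            /* stand-in for the real lock */

/* Documented (and, under Clang TSA, checked) to be called without
 * the graph lock held. */
static int must_run_unlocked(void) GRAPH_UNLOCKED;
static int must_run_unlocked(void)
{
    return graph_lock == 0;
}
```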
Drivers were a bit confused about whether .bdrv_open can run in a
coroutine and whether or not it holds a graph lock.
It cannot keep a graph lock from the caller across the whole function
because it both changes the graph (requires a writer lock) and does I/O
(requires a reader lock). Therefore,
From: Paolo Bonzini
After the recent introduction of many new coroutine callbacks,
a couple of calls from non-coroutine_fn to coroutine_fn have sneaked
in; fix them.
Signed-off-by: Paolo Bonzini
Message-Id: <20230406101752.242125-1-pbonz...@redhat.com>
Reviewed-by: Kevin Wolf
Signed-off-by:
job_cancel_locked() drops the job list lock temporarily and it may call
aio_poll(). We must assume that the list has changed after this call.
Also, with unlucky timing, it can end up freeing the job during
job_completed_txn_abort_locked(), making the job pointer invalid, too.
For both reasons, we
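The safe-iteration pattern this implies can be sketched with a stand-in job list: after any call that may have dropped the lock and changed the list, restart from the head rather than follow a possibly dangling pointer. (Types and names below are illustrative, not the real job API.)

```c
#include <assert.h>
#include <stdlib.h>

struct Job { struct Job *next; };
static struct Job *job_list;

static void job_add(void)
{
    struct Job *job = calloc(1, sizeof(*job));
    job->next = job_list;
    job_list = job;
}

/* Models job_cancel_locked(): unlinks and frees the job, i.e. the
 * list has changed and 'job' is dangling afterwards. */
static void job_cancel(struct Job *job)
{
    struct Job **p = &job_list;
    while (*p != job) {
        p = &(*p)->next;
    }
    *p = job->next;
    free(job);
}

static int cancel_all(void)
{
    int cancelled = 0;
    struct Job *job;

retry:
    for (job = job_list; job; job = job->next) {
        job_cancel(job);          /* list may have changed ... */
        cancelled++;
        goto retry;               /* ... so start over from the head */
    }
    return cancelled;
}

static int demo(void)
{
    job_add();
    job_add();
    job_add();
    return cancel_all();
}
```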
From: Stefan Hajnoczi
reader_count() is a performance bottleneck because the global
aio_context_list_lock mutex causes thread contention. Put this debugging
assertion behind a new ./configure --enable-debug-graph-lock option and
disable it by default.
The --enable-debug-graph-lock option is
There is a bdrv_co_getlength() now, which should be used in coroutine
context.
This requires adding GRAPH_RDLOCK to some functions so that this still
compiles with TSA because bdrv_co_getlength() is GRAPH_RDLOCK.
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Socket paths need to be short to avoid failures. This is why there is
an iotests.sock_dir (defaulting to /tmp) separate from the disk image
base directory.
Make use of it to fix failures in too deeply nested test directories.
Fixes: ab7f7e67a7e7b49964109501dfcde4ec29bae60e
Signed-off-by: Kevin
From: Emanuele Giuseppe Esposito
This adds GRAPH_RDLOCK annotations to declare that callers of amend
callbacks in BlockDriver need to hold a reader lock for the graph.
Signed-off-by: Emanuele Giuseppe Esposito
Signed-off-by: Kevin Wolf
Reviewed-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
From: Eric Blake
Commit fe904ea824 added a fail_inactivate label, which tries to
reactivate disks on the source after a failure while s->state ==
MIGRATION_STATUS_ACTIVE, but didn't actually use the label if
qemu_savevm_state_complete_precopy() failed. This failure to
reactivate is also present
Migration code can call bdrv_activate() in coroutine context, whereas
other callers call it outside of coroutines. As it calls other code that
is not supposed to run in coroutines, standardise on running outside of
coroutines.
This adds a no_co_wrapper to switch to the main loop before calling
Am 21.04.2023 um 13:53 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Let's simplify things:
>
> First, actions generally don't need access to common BlkActionState
> structure. The only exclusion are backup actions that need
> block_job_txn.
>
> Next, for transaction actions of Transaction API
Am 21.04.2023 um 13:53 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Only backup supports GROUPED mode. Make this logic more clear. And
> avoid passing extra thing to each action.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy
> ---
> blockdev.c | 92
Am 21.04.2023 um 13:53 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Kevin Wolf
Am 21.04.2023 um 13:53 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Look at qmp_transaction(): dev_list is not obvious name for list of
> actions. Let's look at qapi spec, this argument is "actions". Let's
> follow the common practice of using same argument names in qapi scheme
> and code.
>
>
Am 21.04.2023 um 13:53 hat Vladimir Sementsov-Ogievskiy geschrieben:
> We are going to add more block-graph modifying transaction actions,
> and block-graph modifying functions are already based on Transaction
> API.
>
> Next, we'll need to separately update permissions after several
>
On 5/9/23 20:17, Juan Quintela wrote:
The following changes since commit 271477b59e723250f17a7e20f139262057921b6a:
Merge tag 'compression-code-pull-request' of
https://gitlab.com/juan.quintela/qemu into staging (2023-05-08 20:38:05 +0100)
are available in the Git repository at:
Wed, 10 May 2023 00:58:27 +0200 Olaf Hering :
> In my debugging (with v8.0.0) it turned out that the three pci_set_word
> calls cause the domU to hang. In fact, it is just the last one:
>
>pci_set_byte(pci_conf + 0x20, 0x01); /* BMIBA: 20-23h */
>
> It changes the value from 0xc121 to 0x1.
If I
Markus Armbruster writes:
> To recognize a line starting with a section symbol and/or tag, we
> first split it at the first space, then examine the part left of the
> space. We can just as well examine the unsplit line, so do that.
>
> Signed-off-by: Markus Armbruster
> ---
>
Peter Xu wrote:
Hi
[Adding Kevin to the party]
> On Fri, May 05, 2023 at 03:46:52PM +0200, Fiona Ebner wrote:
>> To fix it, ensure that the BQL is held during setup. To avoid changing
>> the behavior for migration too, introduce conditionals for the setup
>> callbacks that need the BQL and