Branch: refs/heads/staging
Home: https://github.com/qemu/qemu
Commit: dbf70f0a0388d0ba5bdf9c31ff3b81f77f4a90eb
https://github.com/qemu/qemu/commit/dbf70f0a0388d0ba5bdf9c31ff3b81f77f4a90eb
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M util/fdmon-io_uring.c
Log Message:
-----------
aio-posix: fix race between io_uring CQE and AioHandler deletion
When an AioHandler is enqueued on ctx->submit_list for removal, the
fill_sq_ring() function will submit an io_uring POLL_REMOVE operation to
cancel the in-flight POLL_ADD operation.
There is a race when another thread enqueues an AioHandler for deletion
on ctx->submit_list when the POLL_ADD CQE has already appeared. In that
case POLL_REMOVE is unnecessary. The code already handled this, but
forgot that the AioHandler itself is still on ctx->submit_list when the
POLL_ADD CQE is being processed. It's unsafe to delete the AioHandler at
that point in time (use-after-free).
Solve this problem by keeping the AioHandler alive but setting a flag so
that it will be deleted by fill_sq_ring() when it runs.
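The deferred-deletion idea can be sketched in plain C. This is only an illustration under stated assumptions, not the code from util/fdmon-io_uring.c: the Node type, the pending_delete flag and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an AioHandler that may sit on a submit list. */
typedef struct Node {
    struct Node *next;      /* link on the submit list */
    bool on_submit_list;    /* true while the list still references us */
    bool pending_delete;    /* set when deletion must be deferred */
} Node;

static int freed_count;     /* counts frees so callers can observe them */

/* Completion path: free now only if the submit list holds no reference. */
static void on_completion(Node *n)
{
    if (n->on_submit_list) {
        n->pending_delete = true;   /* let the list drain free it later */
    } else {
        free(n);
        freed_count++;
    }
}

/* Drain path (the role fill_sq_ring() plays): unlink, then free if flagged. */
static void drain_submit_list(Node **head)
{
    while (*head) {
        Node *n = *head;
        *head = n->next;
        n->on_submit_list = false;
        if (n->pending_delete) {
            free(n);
            freed_count++;
        }
    }
}
```

The point of the flag is that only one place, the drain, ever frees a node that is still linked, so the list never walks freed memory.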
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: c31a4457498bb845ffcb70068f976e14a3cadaad
https://github.com/qemu/qemu/commit/c31a4457498bb845ffcb70068f976e14a3cadaad
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M util/fdmon-io_uring.c
Log Message:
-----------
aio-posix: fix fdmon-io_uring.c timeout stack variable lifetime
io_uring_prep_timeout() stashes a pointer to the timespec struct rather
than copying its fields. That means the struct must live until after the
SQE has been submitted by io_uring_enter(2). add_timeout_sqe() violates
this constraint because the SQE is not submitted within the function.
Inline add_timeout_sqe() into fdmon_io_uring_wait() so that the struct
lives at least as long as io_uring_enter(2).
This fixes random hangs (bogus timeout values) when the kernel loads
undefined timespec struct values from userspace after the original
struct on the stack has been destroyed.
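The lifetime rule can be illustrated without io_uring. In this sketch FakeSqe, fake_prep_timeout() and fake_submit() are hypothetical stand-ins: the prep step only stores a pointer and the submit step is where the struct is actually read, so both must happen while the timespec is still in scope.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical request slot that keeps a pointer, not a copy. */
typedef struct {
    const struct timespec *ts;
} FakeSqe;

/* Like the prep helper described above, this only stashes the pointer. */
static void fake_prep_timeout(FakeSqe *sqe, const struct timespec *ts)
{
    sqe->ts = ts;
}

/* Stand-in for the syscall: this is where the struct is actually read. */
static long long fake_submit(const FakeSqe *sqe)
{
    return (long long)sqe->ts->tv_sec * 1000000000LL + sqe->ts->tv_nsec;
}

/* Safe: ts is declared in the same function that submits, so it is alive. */
static long long wait_with_timeout(long long ns)
{
    struct timespec ts = {
        .tv_sec = ns / 1000000000LL,
        .tv_nsec = ns % 1000000000LL,
    };
    FakeSqe sqe = { NULL };

    fake_prep_timeout(&sqe, &ts);
    return fake_submit(&sqe);   /* ts is still in scope here */
}
```

Moving the declaration of ts into a helper that returns before fake_submit() runs would recreate the bug: the stored pointer would refer to a dead stack slot.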
Reported-by: Kevin Wolf <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 5f8741fca51d6984ea5fa7bf73cc6c04e8a98585
https://github.com/qemu/qemu/commit/5f8741fca51d6984ea5fa7bf73cc6c04e8a98585
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M util/fdmon-io_uring.c
Log Message:
-----------
aio-posix: fix spurious return from ->wait() due to signals
io_uring_enter(2) only returns -EINTR in some cases when interrupted by
a signal. Therefore the while loop in fdmon_io_uring_wait() is
incomplete and can lead to a spurious early return.
Handle the case when a signal interrupts io_uring_enter(2) but the
syscall returns the number of SQEs submitted (that takes priority over
-EINTR).
This patch probably makes little difference for QEMU, but the test suite
relies on the exact pattern of aio_poll() return values, so it's best to
hide this io_uring syscall interface quirk.
Here is the strace of test-aio receiving 3 SIGCONT signals after this
fix has been applied. Notice how the io_uring_enter(2) return value is 1
the first time because an SQE was submitted, but -EINTR the other times:
eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK) = 9
io_uring_enter(7, 1, 0, 0, NULL, 8) = 1
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, 0x7ffe38a46240) = 0
io_uring_enter(7, 1, 1, IORING_ENTER_GETEVENTS, NULL, 8) = 1
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=596096, si_uid=1000} ---
io_uring_enter(7, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8) = -1 EINTR (Interrupted system call)
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=596096, si_uid=1000} ---
io_uring_enter(7, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8 <unfinished ...>
<... io_uring_enter resumed>) = -1 EINTR (Interrupted system call)
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=596096, si_uid=1000} ---
io_uring_enter(7, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8 <unfinished ...>
<... io_uring_enter resumed>) = 0
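One way to hide the quirk is to keep waiting until a completion is really available. The sketch below replays the return values from the strace above through a hypothetical fake_enter() function. It is an assumption about the general shape of such a loop, not the code in fdmon_io_uring_wait().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Scripted return values for a fake syscall, mirroring the strace:
 * 1 (SQE submitted, then interrupted), -EINTR, -EINTR, 0 (CQE arrived). */
static const int script[] = { 1, -EINTR, -EINTR, 0 };
static size_t script_pos;
static int cq_ready;            /* completions visible to the caller */

/* Hypothetical stand-in for io_uring_enter(2) with min_complete == 1. */
static int fake_enter(void)
{
    int ret = script[script_pos++];
    if (script_pos == sizeof(script) / sizeof(script[0])) {
        cq_ready = 1;           /* the last call really waited */
    }
    return ret;
}

/* Keep waiting until a completion is ready or a real error occurs. */
static int wait_for_completion(void)
{
    for (;;) {
        int ret = fake_enter();
        if (ret == -EINTR) {
            continue;           /* interrupted before doing anything */
        }
        if (ret < 0) {
            return ret;         /* genuine error */
        }
        if (cq_ready > 0) {
            return 0;           /* an event is actually available */
        }
        /* ret >= 0 but nothing completed: a signal cut the wait short */
    }
}
```

A real loop also has to stop when its timeout expires. That case is left out here to keep the sketch short.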
Reported-by: Kevin Wolf <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 511c62a2c6f8c6c9b0ddb235b1bddd6884a78c38
https://github.com/qemu/qemu/commit/511c62a2c6f8c6c9b0ddb235b1bddd6884a78c38
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M util/aio-posix.c
Log Message:
-----------
aio-posix: keep polling enabled with fdmon-io_uring.c
Commit 816a430c517e ("util/aio: Defer disabling poll mode as long as
possible") kept polling enabled when the event loop timeout is 0. Since
there is no timeout the event loop will continue immediately and the
overhead of disabling and re-enabling polling can be avoided.
fdmon-io_uring.c is unable to take advantage of this optimization
because its ->need_wait() function returns true whenever there are new
io_uring SQEs to submit:
    if (timeout || ctx->fdmon_ops->need_wait(ctx)) {
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Polling will be disabled even when timeout == 0.
Extend the optimization to handle the case when need_wait() returns true
and timeout == 0.
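The intended behavior can be written as a small decision function. This is a sketch of the logic as described in this message, with hypothetical names (WaitDecision, decide). The actual change in util/aio-posix.c may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool call_wait;         /* must ->wait() be invoked this iteration? */
    bool disable_polling;   /* must poll mode be switched off first? */
} WaitDecision;

/* timeout == 0 means "do not block"; need_wait mirrors ->need_wait(). */
static WaitDecision decide(int64_t timeout, bool need_wait)
{
    WaitDecision d;

    d.call_wait = timeout != 0 || need_wait;
    /* Only a wait that may block justifies leaving poll mode. */
    d.disable_polling = timeout != 0;
    return d;
}
```

With timeout == 0 and need_wait == true, ->wait() still runs so that pending SQEs are submitted, but polling stays enabled.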
Cc: Chao Gao <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 330adf44dcb69598a298c0e2d3b1337b0de5e11e
https://github.com/qemu/qemu/commit/330adf44dcb69598a298c0e2d3b1337b0de5e11e
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M tests/unit/test-nested-aio-poll.c
Log Message:
-----------
tests/unit: skip test-nested-aio-poll with io_uring
test-nested-aio-poll relies on internal details of how fdmon-poll.c
handles AioContext polling. Skip it when other fdmon implementations are
in use.
The reason why fdmon-io_uring.c behaves differently from fdmon-poll.c is
that its fdmon_ops->need_wait() function returns true when
io_uring_enter(2) must be called (e.g. to submit pending SQEs).
AioContext polling is skipped when ->need_wait() returns true, so the
test case will never enter AioContext polling mode with
fdmon-io_uring.c.
Restrict this test to fdmon-poll.c and drop the
aio_context_use_g_source() call since it's no longer necessary.
Note that this test is only built on POSIX systems so it is safe to
include "util/aio-posix.h".
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: ded29e64c6f3a95d37e18823cc90c023aa7af236
https://github.com/qemu/qemu/commit/ded29e64c6f3a95d37e18823cc90c023aa7af236
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M include/block/aio.h
M meson.build
M tests/unit/test-aio.c
M util/aio-posix.c
M util/aio-posix.h
M util/async.c
M util/fdmon-epoll.c
M util/fdmon-io_uring.c
M util/fdmon-poll.c
Log Message:
-----------
aio-posix: integrate fdmon into glib event loop
AioContext's glib integration only supports ppoll(2) file descriptor
monitoring. epoll(7) and io_uring(7) disable themselves and switch back
to ppoll(2) when the glib event loop is used. The main loop thread
cannot use epoll(7) or io_uring(7) because it always uses the glib event
loop.
Future QEMU features may require io_uring(7). One example is uring_cmd
support in FUSE exports. Each feature could create its own io_uring(7)
context and integrate it into the event loop, but this is inefficient
due to extra syscalls. It would be more efficient to reuse the
AioContext's existing fdmon-io_uring.c io_uring(7) context because
fdmon-io_uring.c will already be active on systems where Linux io_uring
is available.
In order to keep fdmon-io_uring.c's AioContext operational even when the
glib event loop is used, extend FDMonOps with an API similar to
GSourceFuncs so that file descriptor monitoring can integrate into the
glib event loop.
A quick summary of the GSourceFuncs API:
- prepare() is called each event loop iteration before waiting for file
descriptors and timers.
- check() is called to determine whether events are ready to be
dispatched after waiting.
- dispatch() is called to process events.
More details here: https://docs.gtk.org/glib/struct.SourceFuncs.html
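The prepare/check/dispatch contract can be modelled in a few lines of C. The sketch below does not use glib; MiniSource, MiniSourceFuncs and iterate() are hypothetical names that only mimic the calling order summarized above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the prepare/check/dispatch contract. */
typedef struct MiniSource MiniSource;
typedef struct {
    bool (*prepare)(MiniSource *s, int *timeout_ms);
    bool (*check)(MiniSource *s);
    void (*dispatch)(MiniSource *s);
} MiniSourceFuncs;

struct MiniSource {
    const MiniSourceFuncs *funcs;
    int pending;        /* events that became ready while waiting */
    int dispatched;     /* events handed to callbacks so far */
};

static bool counter_prepare(MiniSource *s, int *timeout_ms)
{
    *timeout_ms = -1;           /* no timeout of our own */
    return s->pending > 0;      /* ready without waiting? */
}

static bool counter_check(MiniSource *s)
{
    return s->pending > 0;
}

static void counter_dispatch(MiniSource *s)
{
    s->dispatched += s->pending;
    s->pending = 0;
}

static const MiniSourceFuncs counter_funcs = {
    counter_prepare, counter_check, counter_dispatch,
};

/* One loop iteration; "arrivals" fakes what the wait step would observe. */
static void iterate(MiniSource *s, int arrivals)
{
    int timeout_ms;
    bool ready = s->funcs->prepare(s, &timeout_ms);

    if (!ready) {
        s->pending += arrivals;         /* stands in for ppoll(2) */
        ready = s->funcs->check(s);
    }
    if (ready) {
        s->funcs->dispatch(s);
    }
}
```

An fdmon implementation that plugs into such a loop supplies its own versions of the three callbacks instead of owning the wait step.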
Move the ppoll(2)-specific code from aio-posix.c into fdmon-poll.c and
also implement epoll(7)- and io_uring(7)-specific file descriptor
monitoring code for glib event loops.
Note that it's still faster to use aio_poll() rather than the glib event
loop since glib waits for file descriptor activity with ppoll(2) and
does not support adaptive polling. But at least epoll(7) and io_uring(7)
now work in glib event loops.
Splitting this into multiple commits without temporarily breaking
AioContext proved difficult so this commit makes all the changes. The
next commit will remove the aio_context_use_g_source() API because it is
no longer needed.
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-ID: <[email protected]>
[kwolf: Build fixes; fix AioContext.list_lock use after destroy]
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: d1f42b600a506393dfa99c0fa5ff7820c5e9f131
https://github.com/qemu/qemu/commit/d1f42b600a506393dfa99c0fa5ff7820c5e9f131
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M include/block/aio.h
M util/aio-posix.c
M util/aio-win32.c
M util/async.c
Log Message:
-----------
aio: remove aio_context_use_g_source()
There is no need for aio_context_use_g_source() now that epoll(7) and
io_uring(7) file descriptor monitoring works with the glib event loop.
AioContext doesn't need to be notified that GSource is being used.
On hosts with io_uring support this now enables fdmon-io_uring.c by
default, replacing fdmon-poll.c and fdmon-epoll.c. In other words, the
event loop will use io_uring!
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 3769b9abe9e03acc3cbf4a22a8754a2c09482c7d
https://github.com/qemu/qemu/commit/3769b9abe9e03acc3cbf4a22a8754a2c09482c7d
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M include/block/aio.h
M util/async.c
Log Message:
-----------
aio: free AioContext when aio_context_new() fails
g_source_destroy() only removes the GSource from the GMainContext it's
attached to, if any. It does not free it.
Use g_source_unref() instead so that the AioContext (which embeds a
GSource) is freed. There is no need to call g_source_destroy() in
aio_context_new() because the GSource isn't attached to a GMainContext
yet.
aio_ctx_finalize() expects everything to be set up already, so introduce
the new ctx->initialized boolean and do nothing when called with
!initialized. This also requires moving aio_context_setup() down after
event_notifier_init() since aio_ctx_finalize() won't release any
resources that aio_context_setup() acquired.
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 421dcc8023aa3751763d12aa0e13d8dcbb393b4c
https://github.com/qemu/qemu/commit/421dcc8023aa3751763d12aa0e13d8dcbb393b4c
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M include/block/aio.h
M util/aio-posix.c
M util/aio-win32.c
M util/async.c
Log Message:
-----------
aio: add errp argument to aio_context_setup()
When aio_context_new() -> aio_context_setup() fails at startup it
doesn't really matter whether errors are returned to the caller or the
process terminates immediately.
However, it is not acceptable to terminate when hotplugging --object
iothread at runtime. Refactor aio_context_setup() so that errors can be
propagated. The next commit will set errp when fdmon_io_uring_setup()
fails.
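The error propagation pattern looks roughly like this. MiniError, mini_error_set() and mini_context_setup() are hypothetical, heavily simplified stand-ins for QEMU's Error API and aio_context_setup(). The message text is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-in for QEMU's Error object. */
typedef struct {
    const char *msg;
} MiniError;

/* Record an error for the caller if it asked for one. */
static void mini_error_set(MiniError **errp, MiniError *storage,
                           const char *msg)
{
    if (errp) {
        storage->msg = msg;
        *errp = storage;
    }
}

static MiniError setup_error;   /* static storage keeps the sketch short */

/* Returns true on success; on failure sets *errp instead of exiting. */
static bool mini_context_setup(bool backend_ok, MiniError **errp)
{
    if (!backend_ok) {
        mini_error_set(errp, &setup_error, "Failed to set up backend");
        return false;
    }
    return true;
}
```

A startup caller can still choose to exit on failure, while a hotplug caller can report the error and keep the process running.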
Suggested-by: Kevin Wolf <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 59202c98c0d3b3c3fbb5f5efc6b91411d991ec76
https://github.com/qemu/qemu/commit/59202c98c0d3b3c3fbb5f5efc6b91411d991ec76
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M util/aio-posix.c
M util/aio-posix.h
M util/fdmon-io_uring.c
Log Message:
-----------
aio-posix: gracefully handle io_uring_queue_init() failure
io_uring may not be available at runtime due to system policies (e.g.
the io_uring_disabled sysctl) or creation could fail due to file
descriptor resource limits.
Handle failure scenarios as follows:
If another AioContext already has io_uring, then fail AioContext
creation so that the aio_add_sqe() API is available uniformly from all
QEMU threads. Otherwise fall back to epoll(7) if io_uring is
unavailable.
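The failure handling described above can be summarized as a decision function. BackendChoice and choose_backend() are hypothetical names used for illustration. The real code works on AioContext state rather than two booleans.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    BACKEND_IO_URING,
    BACKEND_EPOLL,      /* fallback when io_uring is unavailable */
    BACKEND_FAIL,       /* refuse to create the context */
} BackendChoice;

static BackendChoice choose_backend(bool io_uring_init_ok,
                                    bool other_ctx_has_io_uring)
{
    if (io_uring_init_ok) {
        return BACKEND_IO_URING;
    }
    /* Mixing backends would make aio_add_sqe() work in some threads only. */
    if (other_ctx_has_io_uring) {
        return BACKEND_FAIL;
    }
    return BACKEND_EPOLL;
}
```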
Notes:
- Update the comment about selecting the fastest fdmon implementation.
At this point it's not about speed anymore, it's about aio_add_sqe()
API availability.
- Uppercase the error message when converting from error_report() to
error_setg_errno() for consistency (but there are instances of
lowercase in the codebase).
- It's easier to move the #ifdefs from aio-posix.h to aio-posix.c.
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: a63e41f2a41cdd4a577e50eadc9b1f6d3028ab36
https://github.com/qemu/qemu/commit/a63e41f2a41cdd4a577e50eadc9b1f6d3028ab36
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M util/fdmon-io_uring.c
Log Message:
-----------
aio-posix: unindent fdmon_io_uring_destroy()
Reduce the level of indentation to make further code changes easier to
read.
Signed-off-by: Stefan Hajnoczi <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 87e7a0f4237a076b0fd75795cba8bd9bff28a147
https://github.com/qemu/qemu/commit/87e7a0f4237a076b0fd75795cba8bd9bff28a147
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M include/block/aio.h
M util/aio-posix.c
Log Message:
-----------
aio-posix: add fdmon_ops->dispatch()
The ppoll and epoll file descriptor monitoring implementations rely on
the event loop's generic file descriptor, timer, and BH dispatch code to
invoke user callbacks.
The io_uring file descriptor monitoring implementation will need
io_uring-specific dispatch logic for CQE handlers for custom SQEs.
Introduce a new FDMonOps ->dispatch() callback that allows file
descriptor monitoring implementations to invoke user callbacks. The next
patch will use this new callback.
Signed-off-by: Stefan Hajnoczi <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 1eebdab3c37a4a837baa5ca614ae2aca211a1d8d
https://github.com/qemu/qemu/commit/1eebdab3c37a4a837baa5ca614ae2aca211a1d8d
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M include/block/aio.h
M util/aio-posix.c
M util/aio-posix.h
M util/fdmon-io_uring.c
M util/trace-events
Log Message:
-----------
aio-posix: add aio_add_sqe() API for user-defined io_uring requests
Introduce the aio_add_sqe() API for submitting io_uring requests in the
current AioContext. This allows other components in QEMU, like the block
layer, to take advantage of io_uring features without creating their own
io_uring context.
This API supports nested event loops just like file descriptor
monitoring and BHs do. This comes at a complexity cost: CQE callbacks
must be placed on a list so that nested event loops can invoke pending
CQE callbacks from parent event loops. If you're wondering why
CqeHandler exists instead of just a callback function pointer, this is
why.
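A minimal sketch of the ready-list idea follows. MiniCqeHandler, mark_ready() and dispatch_ready() are hypothetical and only show why a handler object with a list link is needed. They are not the CqeHandler API itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified handler: callback plus a link for a ready list. */
typedef struct MiniCqeHandler MiniCqeHandler;
struct MiniCqeHandler {
    void (*cb)(MiniCqeHandler *h);
    int result;                 /* completion result */
    MiniCqeHandler *next;       /* link on the ready list */
};

static MiniCqeHandler *ready_head;
static MiniCqeHandler **ready_tail = &ready_head;

/* Completion arrived: queue the handler instead of calling it directly. */
static void mark_ready(MiniCqeHandler *h, int result)
{
    h->result = result;
    h->next = NULL;
    *ready_tail = h;
    ready_tail = &h->next;
}

/* Any event loop level, nested or not, can drain the shared list. */
static int dispatch_ready(void)
{
    int n = 0;

    while (ready_head) {
        MiniCqeHandler *h = ready_head;
        ready_head = h->next;
        if (!ready_head) {
            ready_tail = &ready_head;
        }
        h->cb(h);               /* may itself call dispatch_ready() */
        n++;
    }
    return n;
}

/* Example callback used to observe dispatch. */
static int total;
static void add_result(MiniCqeHandler *h)
{
    total += h->result;
}
```

Because each handler is unlinked before its callback runs, a callback that enters a nested loop and drains the rest of the list does not disturb the outer loop, which then simply finds the list empty.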
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 047dabef97bd0c4af3c3dc453b19e20345de3602
https://github.com/qemu/qemu/commit/047dabef97bd0c4af3c3dc453b19e20345de3602
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M block/file-posix.c
M block/io_uring.c
M block/trace-events
M include/block/aio.h
M include/block/raw-aio.h
R stubs/io_uring.c
M stubs/meson.build
M util/async.c
Log Message:
-----------
block/io_uring: use aio_add_sqe()
AioContext has its own io_uring instance for file descriptor monitoring.
The disk I/O io_uring code was developed separately. Originally I
thought the characteristics of file descriptor monitoring and disk I/O
were too different, requiring separate io_uring instances.
Now it has become clear to me that it's feasible to share a single
io_uring instance for file descriptor monitoring and disk I/O. We're not
using io_uring's IOPOLL feature or anything else that would require a
separate instance.
Unify block/io_uring.c and util/fdmon-io_uring.c using the new
aio_add_sqe() API that allows user-defined io_uring sqe submission. Now
block/io_uring.c just needs to submit readv/writev/fsync and most of the
io_uring-specific logic is handled by fdmon-io_uring.c.
There are two immediate advantages:
1. Fewer system calls. There is no need to monitor the disk I/O io_uring
ring fd from the file descriptor monitoring io_uring instance. Disk
I/O completions are now picked up directly. Also, sqes are
accumulated in the sq ring until the end of the event loop iteration
and there are fewer io_uring_enter(2) syscalls.
2. Less code duplication.
Note that error_setg() messages are not supposed to end with
punctuation, so I removed a '.' for the non-io_uring build error
message.
Signed-off-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 684363fa3bbd78556aa24144195319e19c5602e4
https://github.com/qemu/qemu/commit/684363fa3bbd78556aa24144195319e19c5602e4
Author: Stefan Hajnoczi <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M block/io_uring.c
Log Message:
-----------
block/io_uring: use non-vectored read/write when possible
The io_uring_prep_readv2/writev2() man pages recommend using the
non-vectored read/write operations when possible for performance
reasons.
I didn't measure a significant difference but it doesn't hurt to have
this optimization in place.
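The same single-buffer shortcut can be shown with plain POSIX calls. This sketch uses read(2) and readv(2) as an analogy. It does not use the io_uring prep helpers, and read_maybe_vectored() is a hypothetical name.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

static int vectored_calls;      /* lets callers see which path was taken */

/* Read into iov[0..niov-1], avoiding readv(2) for the single-buffer case. */
static ssize_t read_maybe_vectored(int fd, const struct iovec *iov, int niov)
{
    if (niov == 1) {
        return read(fd, iov[0].iov_base, iov[0].iov_len);
    }
    vectored_calls++;
    return readv(fd, iov, niov);
}
```

The caller keeps passing an iovec array either way, so only the submission path needs to know about the shortcut.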
Suggested-by: Eric Blake <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 9730b9974dafa594c3303374d2d2d2e47fc8809b
https://github.com/qemu/qemu/commit/9730b9974dafa594c3303374d2d2d2e47fc8809b
Author: Yeqi Fu <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M block/bochs.c
M block/file-posix.c
M block/file-win32.c
M block/qcow.c
M include/block/nbd.h
Log Message:
-----------
block: replace TABs with space
Bring the block files in line with the QEMU coding style, with spaces
for indentation. This patch partially resolves issue 371.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/371
Signed-off-by: Yeqi Fu <[email protected]>
Message-ID: <[email protected]>
[thuth: Rebased the patch to the current master branch]
Signed-off-by: Thomas Huth <[email protected]>
Message-ID: <[email protected]>
[kwolf: Fixed up vertical alignment]
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 9dbfd4e28dd11a83f54c371fade8d49a63d6dc1e
https://github.com/qemu/qemu/commit/9dbfd4e28dd11a83f54c371fade8d49a63d6dc1e
Author: Wesley Hershberger <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M block.c
M tests/qemu-iotests/257
M tests/qemu-iotests/257.out
Log Message:
-----------
block: Drop detach_subchain for bdrv_replace_node
Detaching filters using detach_subchain=true can cause segfaults as
described in #3149.
More specifically, this was observed when executing concurrent
block-stream and query-named-block-nodes. block-stream adds a
copy-on-read filter as the main BDS for the blockjob; that filter was
dropped with detach_subchain=true but not unref'd until the blockjob
was free'd. Because query-named-block-nodes assumes that a filter will
always have exactly one child, it caused a segfault when it observed the
detached filter. Stacktrace:
0 bdrv_refresh_filename (bs=0x5efed72f8350)
at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:8082
1 0x00005efea73cf9dc in bdrv_block_device_info
(blk=0x0, bs=0x5efed72f8350, flat=true, errp=0x7ffeb829ebd8)
at block/qapi.c:62
2 0x00005efea7391ed3 in bdrv_named_nodes_list
(flat=<optimized out>, errp=0x7ffeb829ebd8)
at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/block.c:6275
3 0x00005efea7471993 in qmp_query_named_block_nodes
(has_flat=<optimized out>, flat=<optimized out>, errp=0x7ffeb829ebd8)
at /usr/src/qemu-1:10.1.0+ds-5ubuntu2/b/qemu/blockdev.c:2834
4 qmp_marshal_query_named_block_nodes
(args=<optimized out>, ret=0x7f2b753beec0, errp=0x7f2b753beec8)
at qapi/qapi-commands-block-core.c:553
5 0x00005efea74f03a5 in do_qmp_dispatch_bh (opaque=0x7f2b753beed0)
at qapi/qmp-dispatch.c:128
6 0x00005efea75108e6 in aio_bh_poll (ctx=0x5efed6f3f430)
at util/async.c:219
7 0x00005efea74ffdb2 in aio_dispatch (ctx=0x5efed6f3f430)
at util/aio-posix.c:436
8 0x00005efea7512846 in aio_ctx_dispatch (source=<optimized out>,
callback=<optimized out>,user_data=<optimized out>)
at util/async.c:361
9 0x00007f2b77809bfb in ?? ()
from /lib/x86_64-linux-gnu/libglib-2.0.so.0
10 0x00007f2b77809e70 in g_main_context_dispatch ()
from /lib/x86_64-linux-gnu/libglib-2.0.so.0
11 0x00005efea7517228 in glib_pollfds_poll () at util/main-loop.c:287
12 os_host_main_loop_wait (timeout=0) at util/main-loop.c:310
13 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:589
14 0x00005efea7140482 in qemu_main_loop () at system/runstate.c:905
15 0x00005efea744e4e8 in qemu_default_main (opaque=opaque@entry=0x0)
at system/main.c:50
16 0x00005efea6e76319 in main
(argc=<optimized out>, argv=<optimized out>)
at system/main.c:93
As discussed in [email protected],
a filter should not exist without children in the first place; therefore,
drop the parameter entirely as it is only used for filters.
This is a partial revert of 3108a15cf09865456d499b08fe14e3dbec4ccbb3.
After this change, a blockdev-backup job's copy-before-write filter will
hold references to its children until the filter is unref'd. This causes
an additional flush during bdrv_close, so also update iotest 257.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3149
Suggested-by: Kevin Wolf <[email protected]>
Signed-off-by: Wesley Hershberger <[email protected]>
Reviewed-by: Vladimir Sementsov-Ogievskiy <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 23798d3f885497c1f0a0c062fc889e7a5eff0648
https://github.com/qemu/qemu/commit/23798d3f885497c1f0a0c062fc889e7a5eff0648
Author: Kevin Wolf <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M tests/qemu-iotests/tests/resize-below-raw
M tests/qemu-iotests/tests/resize-below-raw.out
Log Message:
-----------
iotests: Test resizing file node under raw with size/offset
This adds some more tests for using the 'size' and 'offset' options of
raw to the recently added resize-below-raw test.
Signed-off-by: Kevin Wolf <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: f00bcc833790c72c08bc5eed97845fdaa7542507
https://github.com/qemu/qemu/commit/f00bcc833790c72c08bc5eed97845fdaa7542507
Author: Akihiko Odaki <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M qemu-img.c
Log Message:
-----------
qemu-img: Fix amend option parse error handling
qemu_opts_del(opts) dereferences opts->list, which is the old amend_opts
pointer that can be dangling after executing
qemu_opts_append(amend_opts, bs->drv->create_opts) and cause
use-after-free.
Fix the potential use-after-free by moving the qemu_opts_del() call
before the qemu_opts_append() call.
Signed-off-by: Akihiko Odaki <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 71c1a1f18c1127dfa706f972821157c0feac168d
https://github.com/qemu/qemu/commit/71c1a1f18c1127dfa706f972821157c0feac168d
Author: Akihiko Odaki <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M tests/qemu-iotests/meson.build
M tests/qemu-iotests/testrunner.py
Log Message:
-----------
iotests: Run iotests with sanitizers
Commit 2cc4d1c5eab1 ("tests/check-block: Skip iotests when sanitizers
are enabled") changed iotests to skip when sanitizers are enabled.
The rationale is that AddressSanitizer emits warnings and reports leaks,
which results in test breakage. Later, sanitizers that are enabled for
production environments (safe-stack and cfi-icall) were exempted.
However, this approach has a few problems.
- It requires a rebuild to disable sanitizers if the existing build has
them enabled.
- It disables other useful non-production sanitizers.
- The exemption of safe-stack and cfi-icall is not correctly
implemented, so qemu-iotests are incorrectly enabled whenever either
safe-stack or cfi-icall is enabled, even if another sanitizer like
AddressSanitizer is enabled as well.
To solve these problems, direct AddressSanitizer warnings to separate
files to avoid changing the test results, and selectively disable
leak detection at runtime instead of requiring all sanitizers to be
disabled at build time.
Signed-off-by: Akihiko Odaki <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 31242df6caff67c96c08566ff372f067bfb47c3a
https://github.com/qemu/qemu/commit/31242df6caff67c96c08566ff372f067bfb47c3a
Author: Jean-Louis Dupond <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M block/qcow2-refcount.c
Log Message:
-----------
qcow2: rename update_refcount_discard to queue_discard
The function just queues discards, and doesn't do any refcount change.
So let's change the function name to align with its function.
Signed-off-by: Jean-Louis Dupond <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 524d5ba8c0b18e8de6530632160bf2aa71b60a21
https://github.com/qemu/qemu/commit/524d5ba8c0b18e8de6530632160bf2aa71b60a21
Author: Jean-Louis Dupond <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M block/qcow2-cluster.c
M block/qcow2-refcount.c
M block/qcow2.h
Log Message:
-----------
qcow2: put discards in discard queue when discard-no-unref is enabled
When discard-no-unref is enabled, discards are not queued like they
should be.
This has been broken since discard-no-unref was added.
Add a helper function qcow2_discard_cluster which handles some common
checks and calls the queue_discards function if needed to add the
discard request to the queue.
Signed-off-by: Jean-Louis Dupond <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Eric Blake <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 061b0275c7c22a97927449ecbac365a4a152a67d
https://github.com/qemu/qemu/commit/061b0275c7c22a97927449ecbac365a4a152a67d
Author: Thomas Huth <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M tests/qemu-iotests/184
Log Message:
-----------
tests/qemu-iotests/184: Fix skip message for qemu-img without throttle
If qemu-img does not support throttling, test 184 currently skips
with the message:
not suitable for this image format: raw
But that's wrong, it's not about the image format, it's about the
throttling not being available in qemu-img. Thus fix this by using
_notrun with a proper message instead.
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: f00a45e9ca1ab5d8f157f8f2109173782157dac3
https://github.com/qemu/qemu/commit/f00a45e9ca1ab5d8f157f8f2109173782157dac3
Author: Thomas Huth <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M tests/qemu-iotests/check
Log Message:
-----------
tests/qemu-iotests: Improve the dry run list to speed up thorough testing
When running the tests in thorough mode, e.g. with:
make -j$(nproc) check SPEED=thorough
we currently always get a huge number of total tests that the test
runner tries to execute (2457 in my case), but a large share of them are
only skipped (1099 in my case, meaning that only 1358 got executed).
This happens because we try to run the whole set of iotests for multiple
image formats while a lot of the tests can only run with one particular
format and thus are marked as SKIP during execution. This is quite a
waste of time during each test run, and also unnecessarily blows up the
displayed list of executed tests in the console output.
Thus let's try to be a little bit smarter: If the "check" script is run
with "-n" and an image format switch (like "-qed") at the same time (which
is what we do for discovering the tests for the meson test runner already),
only report the tests that likely support the given format instead of
providing the whole list of all tests. We can determine whether a test
supports a format or not by looking at the lines in the file that contain
a "supported_fmt" or "unsupported_fmt" statement. This is only a heuristic,
of course, but it is good enough for running the iotests via "make
check-block" - I double-checked that the list of executed tests does not
get changed by this patch, it's only the tests that are skipped anyway that
are now not run anymore.
This way the number of total tests drops from 2457 to 1432 for me, and
the number of skipped tests drops from 1099 to just 74 (meaning that we
still properly run 1432 - 74 = 1358 tests as we did before).
Signed-off-by: Thomas Huth <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 67685a2331f7c4f1d5bf14f253ff8edf6cac9cbb
https://github.com/qemu/qemu/commit/67685a2331f7c4f1d5bf14f253ff8edf6cac9cbb
Author: Thomas Huth <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M tests/qemu-iotests/meson.build
Log Message:
-----------
tests/qemu-iotest: Add more image formats to the thorough testing
Now that the "check" script is a little bit smarter with providing
a list of tests that are supported for an image format, we can also
add more image formats that can be used for generic block layer
testing. (Note: qcow1 and luks are not added because some tests
there currently fail, and other formats like bochs, cloop, dmg and
vvfat do not work with the generic tests and thus would only get
skipped if we tried to add them here.)
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 1bd7bfbc2ba3ed767eaff3bd73f598e877b30f28
https://github.com/qemu/qemu/commit/1bd7bfbc2ba3ed767eaff3bd73f598e877b30f28
Author: Eric Blake <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M block.c
M block/crypto.c
M block/parallels.c
M block/qcow.c
M block/qcow2.c
M block/qed.c
M block/raw-format.c
M block/vdi.c
M block/vhdx.c
M block/vmdk.c
M block/vpc.c
M include/block/block-global-state.h
Log Message:
-----------
block: Allow drivers to control protocol prefix at creation
This patch is pure refactoring: instead of hard-coding permission to
use a protocol prefix when creating an image, the drivers can now pass
in a parameter, comparable to what they could already do for opening a
pre-existing image. This patch is purely mechanical (all drivers pass
in true for now), but it will enable the next patch to cater to
drivers that want to differ in behavior for the primary image vs. any
secondary images that are opened at the same time as creating the
primary image.
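The idea of a per-call permission flag can be sketched as below. This is a
simplified model under stated assumptions, not the block layer code: the
names has_protocol_prefix() and pick_driver() are hypothetical, and the
prefix test (a ':' before any '/') is a deliberately simple stand-in.

```python
def has_protocol_prefix(filename: str) -> bool:
    """Simplified check: a ':' that appears before any '/'."""
    colon = filename.find(":")
    slash = filename.find("/")
    return colon != -1 and (slash == -1 or colon < slash)


def pick_driver(filename: str, allow_protocol_prefix: bool) -> str:
    """Return the protocol name to use when creating filename.

    With allow_protocol_prefix=False the name is always treated as a
    local file, even if it looks like 'nbd://...'.
    """
    if allow_protocol_prefix and has_protocol_prefix(filename):
        return filename.split(":", 1)[0]
    return "file"


# In this model, passing True everywhere keeps the old behavior.
print(pick_driver("nbd://localhost:10809/", True))
print(pick_driver("nbd://localhost:10809/", False))
```

In this sketch a caller that passes False for a secondary image ends up
treating the whole string as a local path, which then fails to be created.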
Signed-off-by: Eric Blake <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 2e909d7ca95e6d395bd282a5bbf775c4b4eac10f
https://github.com/qemu/qemu/commit/2e909d7ca95e6d395bd282a5bbf775c4b4eac10f
Author: Eric Blake <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M block/qcow2.c
M block/vmdk.c
Log Message:
-----------
qcow2, vmdk: Restrict creation with secondary file using protocol
Ever since CVE-2024-4467 (see commit 7ead9469 in qemu v9.1.0), we have
intentionally treated the opening of secondary files whose name is
specified in the contents of the primary file, such as a qcow2
data_file, as something that must be a local file and not a protocol
prefix (it is still possible to open a qcow2 file that wraps an NBD
data image by using QMP commands, but that is from the explicit action
of the QMP overriding any string encoded in the qcow2 file). At the
time, we did not prevent the use of protocol prefixes on the secondary
image while creating a qcow2 file, but it results in a qcow2 file that
records an empty string for the data_file, rather than the protocol
passed in during creation:
$ qemu-img create -f raw datastore.raw 2G
$ qemu-nbd -e 0 -t -f raw datastore.raw &
$ qemu-img create -f qcow2 -o data_file=nbd://localhost:10809/ \
datastore_nbd.qcow2 2G
Formatting 'datastore_nbd.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off
compression_type=zlib size=2147483648 data_file=nbd://localhost:10809/
lazy_refcounts=off refcount_bits=16
$ qemu-img info datastore_nbd.qcow2 | grep data
image: datastore_nbd.qcow2
data file:
data file raw: false
filename: datastore_nbd.qcow2
And since an empty string was recorded in the file, attempting to open
the image without using QMP to supply the NBD data store fails, with a
somewhat confusing error message:
$ qemu-io -f qcow2 datastore_nbd.qcow2
qemu-io: can't open device datastore_nbd.qcow2: The 'file' block driver
requires a file name
Although the ability to create an image with a convenience reference
to a protocol data file is not a security hole (unlike the case with
open, the image is not untrusted if we are the ones creating it), the
above demo shows that it is still inconsistent. Thus, it makes more
sense if we also insist that image creation rejects a protocol prefix
when using the same syntax. Now, the above attempt produces:
$ qemu-img create -f qcow2 -o data_file=nbd://localhost:10809/ \
datastore_nbd.qcow2 2G
Formatting 'datastore_nbd.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off
compression_type=zlib size=2147483648 data_file=nbd://localhost:10809/
lazy_refcounts=off refcount_bits=16
qemu-img: datastore_nbd.qcow2: Could not create 'nbd://localhost:10809/': No
such file or directory
with datastore_nbd.qcow2 no longer created.
Signed-off-by: Eric Blake <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 909852ba6b4a22fd2b6f9d8b88adb5fc47dfa781
https://github.com/qemu/qemu/commit/909852ba6b4a22fd2b6f9d8b88adb5fc47dfa781
Author: Alberto Garcia <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M qemu-img.c
M tests/qemu-iotests/024
M tests/qemu-iotests/024.out
Log Message:
-----------
qemu-img rebase: don't exceed IO_BUF_SIZE in one operation
During a rebase operation data is copied from the backing chain into
the target image using a loop, and each iteration looks for a
contiguous region of allocated data of at most IO_BUF_SIZE (2 MB).
Once that region is found, and in order to avoid partial writes, its
boundaries are extended so they are aligned to the (sub)clusters of
the target image (see commit 12df580b).
This operation can however result in a region that exceeds the maximum
allowed IO_BUF_SIZE, crashing qemu-img.
This can be easily reproduced when the source image has a smaller
cluster size than the target image:
base <- int <- active
$ qemu-img create -f qcow2 base.qcow2 4M
$ qemu-img create -f qcow2 -F qcow2 -b base.qcow2 -o cluster_size=1M int.qcow2
$ qemu-img create -f qcow2 -F qcow2 -b int.qcow2 -o cluster_size=2M active.qcow2
$ qemu-io -c "write -P 0xff 1M 2M" int.qcow2
$ qemu-img rebase -F qcow2 -b base.qcow2 active.qcow2
qemu-img: qemu-img.c:4102: img_rebase: Assertion `written + pnum <=
IO_BUF_SIZE' failed.
Aborted
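The arithmetic behind the failure can be sketched as follows. This is a
simplified model, not the code in img_rebase(): it assumes the loop finds
the 2 MiB region written at offset 1 MiB and then aligns it to the 2 MiB
clusters of the target; the helper names are hypothetical.

```python
MiB = 1024 * 1024
IO_BUF_SIZE = 2 * MiB  # maximum region size named in the log message


def align_down(value: int, alignment: int) -> int:
    """Round value down to a multiple of alignment."""
    return value - value % alignment


def align_up(value: int, alignment: int) -> int:
    """Round value up to a multiple of alignment."""
    return align_down(value + alignment - 1, alignment)


def aligned_region(offset: int, length: int, cluster_size: int):
    """Extend [offset, offset + length) to cluster boundaries."""
    start = align_down(offset, cluster_size)
    end = align_up(offset + length, cluster_size)
    return start, end - start


# Data written with "write -P 0xff 1M 2M", target cluster size 2M.
start, size = aligned_region(1 * MiB, 2 * MiB, 2 * MiB)
print(start // MiB, size // MiB, size > IO_BUF_SIZE)
```

Under these assumptions the aligned region starts at 0 and spans 4 MiB,
twice IO_BUF_SIZE, which is the condition the assertion rejects.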
Cc: qemu-stable <[email protected]>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3174
Fixes: 12df580b3b7f ("qemu-img: rebase: avoid unnecessary COW operations")
Signed-off-by: Alberto Garcia <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Signed-off-by: Kevin Wolf <[email protected]>
Commit: 9febfa94b69b7146582c48a868bd2330ac45037f
https://github.com/qemu/qemu/commit/9febfa94b69b7146582c48a868bd2330ac45037f
Author: Richard Henderson <[email protected]>
Date: 2025-11-12 (Wed, 12 Nov 2025)
Changed paths:
M block.c
M block/bochs.c
M block/crypto.c
M block/file-posix.c
M block/file-win32.c
M block/io_uring.c
M block/parallels.c
M block/qcow.c
M block/qcow2-cluster.c
M block/qcow2-refcount.c
M block/qcow2.c
M block/qcow2.h
M block/qed.c
M block/raw-format.c
M block/trace-events
M block/vdi.c
M block/vhdx.c
M block/vmdk.c
M block/vpc.c
M include/block/aio.h
M include/block/block-global-state.h
M include/block/nbd.h
M include/block/raw-aio.h
M meson.build
M qemu-img.c
R stubs/io_uring.c
M stubs/meson.build
M tests/qemu-iotests/024
M tests/qemu-iotests/024.out
M tests/qemu-iotests/184
M tests/qemu-iotests/257
M tests/qemu-iotests/257.out
M tests/qemu-iotests/check
M tests/qemu-iotests/meson.build
M tests/qemu-iotests/testrunner.py
M tests/qemu-iotests/tests/resize-below-raw
M tests/qemu-iotests/tests/resize-below-raw.out
M tests/unit/test-aio.c
M tests/unit/test-nested-aio-poll.c
M util/aio-posix.c
M util/aio-posix.h
M util/aio-win32.c
M util/async.c
M util/fdmon-epoll.c
M util/fdmon-io_uring.c
M util/fdmon-poll.c
M util/trace-events
Log Message:
-----------
Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging
Block layer patches
- stream: Fix potential crash during job completion
- aio: add the aio_add_sqe() io_uring API
- qcow2: put discards in discard queue when discard-no-unref is enabled
- qcow2, vmdk: Restrict creation with secondary file using protocol
- qemu-img rebase: Fix assertion failure due to exceeding IO_BUF_SIZE
- iotests: Run iotests with sanitizers
- iotests: Add more image formats to the thorough testing
- iotests: Improve the dry run list to speed up thorough testing
- Code cleanup
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCgAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmkTqWcRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9awPg//VqEgqYbEr3dVUvBFk8tlcewoo7KGICVk
# 4kddOwMJIdcsVpiLuNzqQARH2kHV93Hiv+mVt25o00PkJx565eCGTh/bBFas3UXL
# JMBjgHyJutGr4cijkNrnQgqWfeTgc32xdVEWh1nZM2K7LslzC9I1PfUzfxRMYqZA
# Em0KE3vwQDC7xtIyk4t451hkfcQY8fwN9bDMpD+zbzaLsYTEyOJ900En88iW7oHE
# TuJhrviin11jdQCA26QVNXRaw7iIVVo8vJP1VEgbn31iY+Qpcr/HcQRs0x2gex67
# OqIdh4onqkdGCFDxTGUoAH+jORXWUmk/JipIhl9pJP0ZDyAjsm97ThJ6SvctURsK
# UMU0dzXEc1C5spD2CWnN0PujqHYQqYaylx7MdiCJMjaCfDB3ZeIRsTGoiLMB24P+
# WBrcn2P+f03nC/sVvxRZWrpyI2kZwEh1RsO/mnLQ3apVBFeKqaFi8Ouo9oi1ZMd6
# ahUw7sZSoTxmGY1FhOSRCGEh2Wjy0ZIOx9tHT1U9vig5Kf9KeE81yO8yaq2T60mq
# 9eaUL8rcUrKRiJw9NUkcEYmIUJrh0nUe/kK2RWmbEGMYIH7ASrGqiyUP5FxpekD+
# i/uen4BeyRwe6rnPOzGolg+HMysMBr8VD/8PwJ8g88FLH1jIdTYvFUdRbrkciUlo
# okC+y4+kqiU=
# =SI8s
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 11 Nov 2025 10:23:51 PM CET
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "[email protected]"
# gpg: Good signature from "Kevin Wolf <[email protected]>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (28 commits)
qemu-img rebase: don't exceed IO_BUF_SIZE in one operation
qcow2, vmdk: Restrict creation with secondary file using protocol
block: Allow drivers to control protocol prefix at creation
tests/qemu-iotest: Add more image formats to the thorough testing
tests/qemu-iotests: Improve the dry run list to speed up thorough testing
tests/qemu-iotests/184: Fix skip message for qemu-img without throttle
qcow2: put discards in discard queue when discard-no-unref is enabled
qcow2: rename update_refcount_discard to queue_discard
iotests: Run iotests with sanitizers
qemu-img: Fix amend option parse error handling
iotests: Test resizing file node under raw with size/offset
block: Drop detach_subchain for bdrv_replace_node
block: replace TABs with space
block/io_uring: use non-vectored read/write when possible
block/io_uring: use aio_add_sqe()
aio-posix: add aio_add_sqe() API for user-defined io_uring requests
aio-posix: add fdmon_ops->dispatch()
aio-posix: unindent fdmon_io_uring_destroy()
aio-posix: gracefully handle io_uring_queue_init() failure
aio: add errp argument to aio_context_setup()
...
Signed-off-by: Richard Henderson <[email protected]>
Compare: https://github.com/qemu/qemu/compare/4481234e985a...9febfa94b69b