Re: [RFC 04/11] aio-posix: integrate fdmon into glib event loop

2025-05-28 Thread Eric Blake
On Wed, May 28, 2025 at 03:09:09PM -0400, Stefan Hajnoczi wrote:
> AioContext's glib integration only supports ppoll(2) file descriptor
> monitoring. epoll(7) and io_uring(7) disable themselves and switch back
> to ppoll(2) when the glib event loop is used. The main loop thread
> cannot use epoll(7) or io_uring(7) because it always uses the glib event
> loop.
> 
> Future QEMU features may require io_uring(7). One example is uring_cmd
> support in FUSE exports. Each feature could create its own io_uring(7)
> context and integrate it into the event loop, but this is inefficient
> due to extra syscalls. It would be more efficient to reuse the
> AioContext's existing fdmon-io_uring.c io_uring(7) context because
> fdmon-io_uring.c will already be active on systems where Linux io_uring
> is available.
> 
> In order to keep fdmon-io_uring.c's AioContext operational even when the
> glib event loop is used, extend FDMonOps with an API similar to
> GSourceFuncs so that file descriptor monitoring can integrate into the
> glib event loop.
> 
> A quick summary of the GSourceFuncs API:
> - prepare() is called each event loop iteration before waiting for file
>   descriptors and timers.
> - check() is called to determine whether events are ready to be
>   dispatched after waiting.
> - dispatch() is called to process events.
> 
> More details here: https://docs.gtk.org/glib/struct.SourceFuncs.html
> 
> Move the ppoll(2)-specific code from aio-posix.c into fdmon-poll.c and
> also implement epoll(7)- and io_uring(7)-specific file descriptor
> monitoring code for glib event loops.
> 
> Note that it's still faster to use aio_poll() rather than the glib event
> loop since glib waits for file descriptor activity with ppoll(2) and
> does not support adaptive polling. But at least epoll(7) and io_uring(7)
> now work in glib event loops.

Nice.

> 
> Splitting this into multiple commits without temporarily breaking
> AioContext proved difficult, so this commit makes all the changes. The
> next commit will remove the aio_context_use_g_source() API because it is
> no longer needed.

That results in a big patch, but I agree that there's no easy way to
split it more incrementally.

> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  include/block/aio.h   | 36 ++
>  util/aio-posix.h      |  5 +++
>  tests/unit/test-aio.c |  7 +++-
>  util/aio-posix.c      | 69 -
>  util/fdmon-epoll.c    | 52 ++---
>  util/fdmon-io_uring.c | 44 +-
>  util/fdmon-poll.c     | 88 ++-
>  7 files changed, 239 insertions(+), 62 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




[RFC 04/11] aio-posix: integrate fdmon into glib event loop

2025-05-28 Thread Stefan Hajnoczi
AioContext's glib integration only supports ppoll(2) file descriptor
monitoring. epoll(7) and io_uring(7) disable themselves and switch back
to ppoll(2) when the glib event loop is used. The main loop thread
cannot use epoll(7) or io_uring(7) because it always uses the glib event
loop.

Future QEMU features may require io_uring(7). One example is uring_cmd
support in FUSE exports. Each feature could create its own io_uring(7)
context and integrate it into the event loop, but this is inefficient
due to extra syscalls. It would be more efficient to reuse the
AioContext's existing fdmon-io_uring.c io_uring(7) context because
fdmon-io_uring.c will already be active on systems where Linux io_uring
is available.

In order to keep fdmon-io_uring.c's AioContext operational even when the
glib event loop is used, extend FDMonOps with an API similar to
GSourceFuncs so that file descriptor monitoring can integrate into the
glib event loop.

A quick summary of the GSourceFuncs API:
- prepare() is called each event loop iteration before waiting for file
  descriptors and timers.
- check() is called to determine whether events are ready to be
  dispatched after waiting.
- dispatch() is called to process events.

More details here: https://docs.gtk.org/glib/struct.SourceFuncs.html
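
The prepare/check/dispatch cycle summarized above can be sketched without glib as a toy loop built on poll(2). This is only an illustration under stated assumptions, not the code from this patch: ToySource, ToySourceOps, toy_loop_iterate() and toy_demo() are hypothetical names, and the callbacks are simplified (glib's real prepare() also returns a gboolean and can set a timeout).

```c
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for a source with GSourceFuncs-style callbacks. */
typedef struct ToySource ToySource;

typedef struct {
    void (*prepare)(ToySource *s);  /* called before waiting */
    bool (*check)(ToySource *s);    /* called after waiting: anything ready? */
    void (*dispatch)(ToySource *s); /* called to process ready events */
} ToySourceOps;

struct ToySource {
    const ToySourceOps *ops;
    struct pollfd pfd; /* the fd the loop waits on for this source */
    int dispatched;    /* number of bytes consumed so far */
};

static void toy_prepare(ToySource *s)
{
    /* Arm the fd for this iteration and clear stale results. */
    s->pfd.events = POLLIN;
    s->pfd.revents = 0;
}

static bool toy_check(ToySource *s)
{
    return (s->pfd.revents & POLLIN) != 0;
}

static void toy_dispatch(ToySource *s)
{
    char byte;

    if (read(s->pfd.fd, &byte, 1) == 1) {
        s->dispatched++;
    }
}

static const ToySourceOps toy_ops = {
    .prepare = toy_prepare,
    .check = toy_check,
    .dispatch = toy_dispatch,
};

/* One loop iteration: prepare, wait, check, dispatch. */
bool toy_loop_iterate(ToySource *s, int timeout_ms)
{
    s->ops->prepare(s);
    if (poll(&s->pfd, 1, timeout_ms) < 0) {
        return false;
    }
    if (!s->ops->check(s)) {
        return false; /* nothing became ready while waiting */
    }
    s->ops->dispatch(s);
    return true;
}

/* Runs an idle iteration, then a ready one; returns the dispatch count. */
int toy_demo(void)
{
    int fds[2];
    bool idle, ready;

    if (pipe(fds) != 0) {
        return -1;
    }

    ToySource s = { .ops = &toy_ops, .pfd = { .fd = fds[0] } };

    idle = toy_loop_iterate(&s, 0); /* pipe is empty: no dispatch */
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ready = toy_loop_iterate(&s, 1000); /* one byte pending: dispatch */

    close(fds[0]);
    close(fds[1]);
    return (!idle && ready) ? s.dispatched : -1;
}
```

In this sketch the loop itself owns the wait step, which mirrors how glib does the waiting and only calls back into the source before and after it.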

Move the ppoll(2)-specific code from aio-posix.c into fdmon-poll.c and
also implement epoll(7)- and io_uring(7)-specific file descriptor
monitoring code for glib event loops.

Note that it's still faster to use aio_poll() rather than the glib event
loop since glib waits for file descriptor activity with ppoll(2) and
does not support adaptive polling. But at least epoll(7) and io_uring(7)
now work in glib event loops.

Splitting this into multiple commits without temporarily breaking
AioContext proved difficult, so this commit makes all the changes. The
next commit will remove the aio_context_use_g_source() API because it is
no longer needed.

Signed-off-by: Stefan Hajnoczi 
---
 include/block/aio.h   | 36 ++
 util/aio-posix.h      |  5 +++
 tests/unit/test-aio.c |  7 +++-
 util/aio-posix.c      | 69 -
 util/fdmon-epoll.c    | 52 ++---
 util/fdmon-io_uring.c | 44 +-
 util/fdmon-poll.c     | 88 ++-
 7 files changed, 239 insertions(+), 62 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 99ff48420b..39ed86d14d 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -106,6 +106,38 @@ typedef struct {
      * Returns: true if ->wait() should be called, false otherwise.
      */
     bool (*need_wait)(AioContext *ctx);
+
+    /*
+     * gsource_prepare:
+     * @ctx: the AioContext
+     *
+     * Prepare for the glib event loop to wait for events instead of the usual
+     * ->wait() call. See glib's GSourceFuncs->prepare().
+     */
+    void (*gsource_prepare)(AioContext *ctx);
+
+    /*
+     * gsource_check:
+     * @ctx: the AioContext
+     *
+     * Called by the glib event loop from glib's GSourceFuncs->check() after
+     * waiting for events.
+     *
+     * Returns: true when ready to be dispatched.
+     */
+    bool (*gsource_check)(AioContext *ctx);
+
+    /*
+     * gsource_dispatch:
+     * @ctx: the AioContext
+     * @ready_list: list for handlers that become ready
+     *
+     * Place ready AioHandlers on ready_list. Called as part of the glib event
+     * loop from glib's GSourceFuncs->dispatch().
+     *
+     * Called with list_lock incremented.
+     */
+    void (*gsource_dispatch)(AioContext *ctx, AioHandlerList *ready_list);
 } FDMonOps;
 
 /*
@@ -222,6 +254,7 @@ struct AioContext {
     /* State for file descriptor monitoring using Linux io_uring */
     struct io_uring fdmon_io_uring;
     AioHandlerSList submit_list;
+    gpointer io_uring_fd_tag;
 #endif
 
     /* TimerLists for calling timers - one per clock type.  Has its own
@@ -254,6 +287,9 @@ struct AioContext {
     /* epoll(7) state used when built with CONFIG_EPOLL */
     int epollfd;
 
+    /* The GSource unix fd tag for epollfd */
+    gpointer epollfd_tag;
+
     const FDMonOps *fdmon_ops;
 };
 
diff --git a/util/aio-posix.h b/util/aio-posix.h
index 82a0201ea4..f9994ed79e 100644
--- a/util/aio-posix.h
+++ b/util/aio-posix.h
@@ -47,9 +47,14 @@ void aio_add_ready_handler(AioHandlerList *ready_list, AioHandler *node,
 
 extern const FDMonOps fdmon_poll_ops;
 
+/* Switch back to poll(2). list_lock must be held. */
+void fdmon_poll_downgrade(AioContext *ctx);
+
 #ifdef CONFIG_EPOLL_CREATE1
 bool fdmon_epoll_try_upgrade(AioContext *ctx, unsigned npfd);
 void fdmon_epoll_setup(AioContext *ctx);
+
+/* list_lock must be held */
 void fdmon_epoll_disable(AioContext *ctx);
 #else
 static inline bool fdmon_epoll_try_upgrade(AioContext *ctx, unsigned npfd)
diff --git a/tests/unit/test-aio.c b/tests/unit/test-aio.c
index e77d86be87..010d65b79a 100644
--- a/tests/unit/test-aio.c
+++ b/tests/unit/test-aio.c
@@ -527,