On Thu, Dec 14, 2023 at 12:10:32AM +0100, Paolo Bonzini wrote:
> On Wed, Dec 13, 2023 at 10:15 PM Stefan Hajnoczi <stefa...@redhat.com> wrote:
> > Alternatives welcome! (A cleaner version of this approach might be to forbid
> > cross-thread aio_set_fd_handler() calls and to refactor all
> > aio_set_fd_handler() callers so they come from the AioContext's home thread.
> > I'm starting to think that only the aio_notify() and aio_schedule_bh() APIs
> > should be thread-safe.)
>
> I think that's pretty hard because aio_set_fd_handler() is a pretty
> important part of the handoff from one AioContext to another and also
> of drained_begin()/end(), and both of these things run in the main
> thread.
>
> Regarding how to solve this issue, there is a lot of
> "underdocumenting" of the locking policy in aio-posix.c, and indeed it
> makes running aio_set_fd_handler() in the target AioContext tempting;
> but it is also scary to rely on the iothread being able to react
> quickly. I'm also worried that we're changing the logic just because
> we don't understand the old one, but then we add technical debt.
>
> So, as a first step, I would take inspiration from the block layer
> locking work, and add assertions to functions like poll_set_started()
> or find_aio_handler(). Is the list_lock elevated (nonzero)? Or locked?
> Are we in the iothread? And likewise, for each list, does insertion
> happen from the iothread or with the list_lock taken (and possibly
> elevated)? Does removal happen from the iothread or with list_lock
> zero+taken?
>
> After this step, we should have a clearer idea of the possible states
> of the node (based on the lists, the state is a subset of
> {poll_started, deleted, ready}) and draw a nice graph of the
> transitions. We should also understand if any calls to
> QLIST_IS_INSERTED() have correctness issues.
>
> Good news, I don't think any memory barriers are needed here.
> One thing that we already do correctly is that, once a node is deleted, we
> try to skip work; see for example poll_set_started(). This also
> provides a good place to do cleanup work for deleted nodes, including
> calling poll_end(): aio_free_deleted_handlers(), because it runs with
> list_lock zero and taken, just like the tail of
> aio_remove_fd_handler(). It's the safest possible place to do cleanup
> and to take a lock. Therefore we have:
>
> - a fast path in the iothread that runs without any concurrence with
>   stuff happening in the main thread
>
> - a slow path in the iothread that runs with list_lock zero and taken.
>   The slow path shares logic with the main thread, meaning that
>   aio_free_deleted_handlers() and aio_remove_fd_handler() should share
>   some functions called by both.
>
> If the code is organized this way, any wrong bits should jump out more
> easily. For example, these two lines in aio_remove_fd_handler() are
> clearly misplaced
>
>     node->pfd.revents = 0;
>     node->poll_ready = false;
>
> because they run in the main thread but they touch iothread data! They
> should be after qemu_lockcnt_count() is checked to be zero.
>
> Regarding the call to io_poll_ready(), I would hope that it is
> unnecessary; in other words, that after drained_end() the virtqueue
> notification would be raised. Yes, virtio_queue_set_notification is
> edge triggered rather than level triggered, so it would be necessary
> to add a check with virtio_queue_host_notifier_aio_poll() and
> virtio_queue_host_notifier_aio_poll_ready() in
> virtio_queue_aio_attach_host_notifier, but that does not seem too bad
> because virtio is the only user of the io_poll_begin and io_poll_end
> callbacks. It would have to be documented though.
I think Hanna had the same idea: document that ->io_poll_end() isn't
called by aio_set_fd_handler() and shift the responsibility onto the
caller to get back into a state where notifications are enabled before
they add the fd with aio_set_fd_handler() again.

In a little more detail, the caller needs to do the following before
adding the fd back with aio_set_fd_handler():

1. Call ->io_poll_end().

2. Poll one more time in case an event slipped in, and if one did,
   either write to the eventfd so the fd is immediately readable or
   call ->io_poll_ready() directly.

I think this is more or less what you described above.

I don't like pushing this responsibility onto the caller, but adding a
synchronization point in aio_set_fd_handler() is problematic, so let's
give it a try. I'll try that approach and send a v2.

Stefan

>
> Paolo
>
> > Paolo
> >
> > Stefan Hajnoczi (3):
> >   aio-posix: run aio_set_fd_handler() in target AioContext
> >   aio: use counter instead of ctx->list_lock
> >   aio-posix: call ->poll_end() when removing AioHandler
> >
> >  include/block/aio.h |  22 ++---
> >  util/aio-posix.c    | 197 ++++++++++++++++++++++++++++++++------------
> >  util/async.c        |   2 -
> >  util/fdmon-epoll.c  |   6 +-
> >  4 files changed, 152 insertions(+), 75 deletions(-)
> >
> > --
> > 2.43.0
> >
>