Am 03.05.2023 um 15:11 hat Stefan Hajnoczi geschrieben:
> On Wed, May 03, 2023 at 10:08:46AM +0200, Kevin Wolf wrote:
> > Am 02.05.2023 um 22:06 hat Stefan Hajnoczi geschrieben:
> > > On Tue, May 02, 2023 at 06:04:24PM +0200, Kevin Wolf wrote:
> > > > Am 25.04.2023 um 19:27 hat Stefan Hajnoczi geschrieben:
> > > > > vhost-user activity must be suspended during bdrv_drained_begin/end().
> > > > > This prevents new requests from interfering with whatever is happening
> > > > > in the drained section.
> > > > > 
> > > > > Previously this was done using aio_set_fd_handler()'s is_external
> > > > > argument. In a multi-queue block layer world the 
> > > > > aio_disable_external()
> > > > > API cannot be used since multiple AioContext may be processing I/O, 
> > > > > not
> > > > > just one.
> > > > > 
> > > > > Switch to BlockDevOps->drained_begin/end() callbacks.
> > > > > 
> > > > > Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> > > > > ---
> > > > >  block/export/vhost-user-blk-server.c | 43 
> > > > > ++++++++++++++--------------
> > > > >  util/vhost-user-server.c             | 10 +++----
> > > > >  2 files changed, 26 insertions(+), 27 deletions(-)
> > > > > 
> > > > > diff --git a/block/export/vhost-user-blk-server.c 
> > > > > b/block/export/vhost-user-blk-server.c
> > > > > index 092b86aae4..d20f69cd74 100644
> > > > > --- a/block/export/vhost-user-blk-server.c
> > > > > +++ b/block/export/vhost-user-blk-server.c
> > > > > @@ -208,22 +208,6 @@ static const VuDevIface vu_blk_iface = {
> > > > >      .process_msg           = vu_blk_process_msg,
> > > > >  };
> > > > >  
> > > > > -static void blk_aio_attached(AioContext *ctx, void *opaque)
> > > > > -{
> > > > > -    VuBlkExport *vexp = opaque;
> > > > > -
> > > > > -    vexp->export.ctx = ctx;
> > > > > -    vhost_user_server_attach_aio_context(&vexp->vu_server, ctx);
> > > > > -}
> > > > > -
> > > > > -static void blk_aio_detach(void *opaque)
> > > > > -{
> > > > > -    VuBlkExport *vexp = opaque;
> > > > > -
> > > > > -    vhost_user_server_detach_aio_context(&vexp->vu_server);
> > > > > -    vexp->export.ctx = NULL;
> > > > > -}
> > > > 
> > > > So for changing the AioContext, we now rely on the fact that the node to
> > > > be changed is always drained, so the drain callbacks implicitly cover
> > > > this case, too?
> > > 
> > > Yes.
> > 
> > Ok. This surprised me a bit at first, but I think it's fine.
> > 
> > We just need to remember it if we ever decide that once we have
> > multiqueue, we can actually change the default AioContext without
> > draining the node. But maybe at that point, we have to do more
> > fundamental changes anyway.
> > 
> > > > >  static void
> > > > >  vu_blk_initialize_config(BlockDriverState *bs,
> > > > >                           struct virtio_blk_config *config,
> > > > > @@ -272,6 +256,25 @@ static void vu_blk_exp_resize(void *opaque)
> > > > >      vu_config_change_msg(&vexp->vu_server.vu_dev);
> > > > >  }
> > > > >  
> > > > > +/* Called with vexp->export.ctx acquired */
> > > > > +static void vu_blk_drained_begin(void *opaque)
> > > > > +{
> > > > > +    VuBlkExport *vexp = opaque;
> > > > > +
> > > > > +    vhost_user_server_detach_aio_context(&vexp->vu_server);
> > > > > +}
> > > > 
> > > > Compared to the old code, we're losing the vexp->export.ctx = NULL. This
> > > > is correct at this point because after drained_begin we still keep
> > > > processing requests until we arrive at a quiescent state.
> > > > 
> > > > However, if we detach the AioContext because we're deleting the
> > > > iothread, won't we end up with a dangling pointer in vexp->export.ctx?
> > > > Or can we be certain that nothing interesting happens before drained_end
> > > > updates it with a new valid pointer again?
> > > 
> > > If you want I can add the detach() callback back again and set ctx to
> > > NULL there?
> > 
> > I haven't thought enough about it to say if it's a problem. If you have
> > and are confident that it's correct the way it is, I'm happy with it.
> >
> > But bringing the callback back is the minimal change compared to the old
> > state. It's just unnecessary code if we don't actually need it.
> 
> The reasoning behind my patch is that detach() sets NULL today and we
> would see crashes if ctx was accessed between detach() -> attach().
> Therefore, I'm assuming there are no ctx accesses in the code today and
> removing the ctx = NULL assignment doesn't break anything.

Sometimes ctx = NULL defaults to qemu_get_aio_context(), so in theory
there could be cases where NULL works, but a dangling pointer wouldn't.

> However, my approach is not very defensive. If the code is changed in
> a way that accesses ctx when it's not supposed to, then a dangling
> pointer will be accessed.
> 
> I think leaving the detach() callback there can be justified because it
> will make it easier to detect bugs in the future. I'll add it back in
> the next revision.

Ok, sounds good me.

Kevin

Attachment: signature.asc
Description: PGP signature

Reply via email to