On Tue, Apr 12, 2022 at 09:42:02PM +0200, Paolo Bonzini wrote: > Remove the confusing, and most likely wrong, atomics. The only function > that used to be somewhat in a hot path was nbd_client_connected(), > but it is not anymore after the previous patches. > > The function nbd_client_connecting_wait() was used mostly to check if > a request had to be reissued (outside requests_lock), but also > under requests_lock in nbd_client_connecting_wait(). The two uses have to
"Function A was used mostly..., but also under requests_lock in function A." Reading the rest of the patch, I think...[1] > be separated; for the former we rename it to nbd_client_will_reconnect() > and make it take s->requests_lock; for the latter the access can simply > be inlined. The new name is clearer, and ensures that a missing > conversion is caught by the compiler. I take it your experiments with C++ coroutines helped find this ;) > > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> > --- > block/nbd.c | 88 +++++++++++++++++++++++++++++------------------------ > 1 file changed, 48 insertions(+), 40 deletions(-) > @@ -226,7 +225,9 @@ static void nbd_teardown_connection(BlockDriverState *bs) > s->ioc = NULL; > } > > - s->state = NBD_CLIENT_QUIT; > + WITH_QEMU_LOCK_GUARD(&s->requests_lock) { > + s->state = NBD_CLIENT_QUIT; > + } > } This style for protecting s->state at the end of the function takes 3 lines thanks to WITH_QEMU_LOCK_GUARD...[2] > > static void open_timer_del(BDRVNBDState *s) > @@ -255,16 +256,13 @@ static void open_timer_init(BDRVNBDState *s, uint64_t > expire_time_ns) > timer_mod(s->open_timer, expire_time_ns); > } > > -static bool nbd_client_connecting(BDRVNBDState *s) > +static bool nbd_client_will_reconnect(BDRVNBDState *s) This part of the diff got hard to read, since you mixed shuffling functions with a rename. On a closer read, I see that nbd_client_connecting() was merely moved later[3], while the new name nbd_client_will_reconnect()...[4] > { > - NBDClientState state = qatomic_load_acquire(&s->state); > - return state == NBD_CLIENT_CONNECTING_WAIT || > - state == NBD_CLIENT_CONNECTING_NOWAIT; > -} > - > -static bool nbd_client_connecting_wait(BDRVNBDState *s) [4]...is indeed happening to nbd_client_connecting_wait(), as promised in the commit message. Which means: [1]...so it looks like the first 'function A' did indeed want to be nbd_client_connecting_wait() which got renamed (since nbd_client_connecting() was moved later in the file), while...[1] > -{ > - return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT; > + /* > + * Called only after a socket error, so this is not performance > sensitive. > + */ > + QEMU_LOCK_GUARD(&s->requests_lock); > + return s->state == NBD_CLIENT_CONNECTING_WAIT; > } [2]...while here, you only needed two lines, using QEMU_LOCK_GUARD. Both styles work, but it seems like we should be consistent, and I would favor the shorter style when all that is being guarded is a single line. > > /* > @@ -351,15 +349,24 @@ int coroutine_fn > nbd_co_do_establish_connection(BlockDriverState *bs, > qio_channel_attach_aio_context(s->ioc, bdrv_get_aio_context(bs)); > > /* successfully connected */ > - s->state = NBD_CLIENT_CONNECTED; > + WITH_QEMU_LOCK_GUARD(&s->requests_lock) { > + s->state = NBD_CLIENT_CONNECTED; > + } > > return 0; > } Another place where the shorter QEMU_LOCK_GUARD() would work. > > +/* Called with s->requests_lock held. */ > +static bool nbd_client_connecting(BDRVNBDState *s) [3]...here's where the moved function threw me off. Perhaps splitting out a preliminary patch to just move the function before converting it to be under s->requests_lock, so that the rename of a different function doesn't cause a hard-to-grok diff, would be wise. > +{ > + return s->state == NBD_CLIENT_CONNECTING_WAIT || > + s->state == NBD_CLIENT_CONNECTING_NOWAIT; > +} > + > /* Called with s->requests_lock taken. */ > static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s) > { > - bool blocking = nbd_client_connecting_wait(s); > + bool blocking = s->state == NBD_CLIENT_CONNECTING_WAIT; [1]...and the second instance of 'function A' in the commit message should have been nbd_reconnect_attempt(). As messy as the diff was, I still think I understood it. With the necessary correction to the commit message according to [1], I could be comfortable with Reviewed-by: Eric Blake <ebl...@redhat.com> although the suggestion in [3] to split out the function motion to a separate patch may result in the v2 series looking different enough that you may want to leave off my R-b to ensure I still review things carefully. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org