On Mon, 1 Apr 2024 11:33:09AM -0500, Eric Blake wrote: > On Mon, Apr 01, 2024 at 08:41:20PM +0800, Zhu Yangyang wrote: > > Coroutines are not supposed to block. Instead, they should yield. > > > > Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation") > > Signed-off-by: Zhu Yangyang <zhuyangyan...@huawei.com> > > --- > > nbd/client.c | 7 ++++--- > > nbd/common.c | 19 ++++++++++++++++--- > > nbd/nbd-internal.h | 6 +++--- > > nbd/server.c | 10 +++++----- > > 4 files changed, 28 insertions(+), 14 deletions(-) > > > > diff --git a/nbd/client.c b/nbd/client.c > > index 29ffc609a4..1ab91ed205 100644 > > --- a/nbd/client.c > > +++ b/nbd/client.c > > @@ -619,18 +619,19 @@ static QIOChannel *nbd_receive_starttls(QIOChannel > > *ioc, > > return NULL; > > } > > qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-client-tls"); > > - data.loop = g_main_loop_new(g_main_context_default(), FALSE); > > trace_nbd_receive_starttls_tls_handshake(); > > qio_channel_tls_handshake(tioc, > > - nbd_tls_handshake, > > + nbd_client_tls_handshake, > > &data, > > NULL, > > NULL); > > > > if (!data.complete) { > > + data.loop = g_main_loop_new(g_main_context_default(), FALSE); > > g_main_loop_run(data.loop); > > + g_main_loop_unref(data.loop); > > } > > - g_main_loop_unref(data.loop); > > + > > Aha - you figured out an elegant way around the client code not being > in coroutine context that had been stumping me, at least for the sake > of a minimal patch. > > > if (data.error) { > > error_propagate(errp, data.error); > > object_unref(OBJECT(tioc)); > > diff --git a/nbd/common.c b/nbd/common.c > > index 3247c1d618..01ca30a5c4 100644 > > --- a/nbd/common.c > > +++ b/nbd/common.c > > @@ -47,14 +47,27 @@ int nbd_drop(QIOChannel *ioc, size_t size, Error **errp) > > } > > > > > > -void nbd_tls_handshake(QIOTask *task, > > - void *opaque) > > +void nbd_client_tls_handshake(QIOTask *task, void *opaque) > > { > > struct NBDTLSHandshakeData *data = opaque; > > > > qio_task_propagate_error(task, &data->error); > > data->complete = true; > > - g_main_loop_quit(data->loop); > > + if (data->loop) { > > + g_main_loop_quit(data->loop); > > + } > > +} > > + > > + > > +void nbd_server_tls_handshake(QIOTask *task, void *opaque) > > +{ > > + struct NBDTLSHandshakeData *data = opaque; > > + > > + qio_task_propagate_error(task, &data->error); > > + data->complete = true; > > + if (!qemu_coroutine_entered(data->co)) { > > + aio_co_wake(data->co); > > + } > > } > > This matches up with what I was experimenting with last week, in that > server is in coroutine context while client is not. However, it means > we no longer have a common implementation between the two, so keeping > the code isolated in nbd/common.c makes less sense than just placing > the two specific callbacks in the two specific files right where their > only use case exists (and even making them static at that point).
Yes, we can implement nbd_tls_handshake() on both client and server side. It looks a lot clearer. We can even extract the common code to an nbd_tls_handshake_complete(). nbd/common.c void nbd_tls_handshake_complete(QIOTask *task, void *opaque) { struct NBDTLSHandshakeData *data = opaque; qio_task_propagate_error(task, &data->error); data->complete = true; } server.c / client.c static void nbd_tls_handshake(QIOTask *task, void *opaque) { struct NBDTLSHandshakeData *data = opaque; nbd_tls_handshake_complete(task, opaque); ... } > > Or, can we still merge it into one helper? It looks like we now have > 3 viable possibilities: > > data->loop data->co > non-NULL non-NULL impossible > NULL NULL client, qio_task completed right away > non-NULL NULL client, qio_task did not complete right away > NULL non-NULL server, waking the coroutine depends on if we are in > one This seems a little complicated. > > With that, we can still get by with one function, but need good > documentation. I'll post a v3 along those lines, to see what you > think. > > > > > > > diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h > > index dfa02f77ee..99cca9382c 100644 > > --- a/nbd/nbd-internal.h > > +++ b/nbd/nbd-internal.h > > @@ -74,13 +74,13 @@ static inline int nbd_write(QIOChannel *ioc, const void > > *buffer, size_t size, > > > > struct NBDTLSHandshakeData { > > GMainLoop *loop; > > + Coroutine *co; > > bool complete; > > Error *error; > > }; > > I had tried to get rid of the GMainLoop *loop member altogether, but > your change has the benefit of a smaller diff than what I was facing > (I got lost in the weeds trying to see if I could convert all of the > client into running in coroutine context). I saw your reply and also tried to put the client in the coroutine context, And then I found that the event listener is registered to the default main_context, This means that we can't use aio_poll(ctx) and AIO_WAIT_WHILE() to wait for the coroutine to complete. GMainLoop *loop may not be circumvented. g_source_attach(source, NULL) qio_channel_add_watch_full() qio_channel_tls_handshake_task() qio_channel_tls_handshake() nbd_receive_starttls() nbd_start_negotiate() > > > > - > > -void nbd_tls_handshake(QIOTask *task, > > - void *opaque); > > +void nbd_server_tls_handshake(QIOTask *task, void *opaque); > > +void nbd_client_tls_handshake(QIOTask *task, void *opaque); > > > > int nbd_drop(QIOChannel *ioc, size_t size, Error **errp); > > > > diff --git a/nbd/server.c b/nbd/server.c > > index c3484cc1eb..b218512ced 100644 > > --- a/nbd/server.c > > +++ b/nbd/server.c > > @@ -777,17 +777,17 @@ static QIOChannel > > *nbd_negotiate_handle_starttls(NBDClient *client, > > > > qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-server-tls"); > > trace_nbd_negotiate_handle_starttls_handshake(); > > - data.loop = g_main_loop_new(g_main_context_default(), FALSE); > > + data.co = qemu_coroutine_self(); > > qio_channel_tls_handshake(tioc, > > - nbd_tls_handshake, > > + nbd_server_tls_handshake, > > &data, > > NULL, > > NULL); > > > > - if (!data.complete) { > > - g_main_loop_run(data.loop); > > + while (!data.complete) { > > + qemu_coroutine_yield(); > > } > > - g_main_loop_unref(data.loop); > > + > > if (data.error) { > > object_unref(OBJECT(tioc)); > > error_propagate(errp, data.error); > > -- > > 2.33.0 > > > > Thanks for the updated patch - it looks like we are heading in a good > direction. > > -- > Eric Blake, Principal Software Engineer > Red Hat, Inc. > Virtualization: qemu.org | libguestfs.o -- Best Regards, Zhu Yangyang