On shutdown nbd driver may be in a connecting state. We should shutdown it as well, otherwise we may hang in nbd_teardown_connection, waiting for conneciton_co to finish in BDRV_POLL_WHILE(bs, s->connection_co) loop if remote server is down.
How to reproduce the dead lock: 1. Create nbd-fault-injector.conf with the following contents: [inject-error "mega1"] event=data io=readwrite when=before 2. In one terminal run nbd-fault-injector in a loop, like this: n=1; while true; do echo $n; ((n++)); ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf; done 3. In another terminal run qemu-io in a loop, like this: n=1; while true; do echo $n; ((n++)); ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000; done After some time, qemu-io will hang. Note, that this hang may be triggered by another bug, so the whole case is fixed only together with commit "block/nbd: allow drain during reconnect attempt". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> Reviewed-by: Eric Blake <ebl...@redhat.com> --- block/nbd.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index 6d19f3c660..dfe1408b2d 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -209,11 +209,15 @@ static void nbd_teardown_connection(BlockDriverState *bs) { BDRVNBDState *s = (BDRVNBDState *)bs->opaque; - if (s->state == NBD_CLIENT_CONNECTED) { + if (s->ioc) { /* finish any pending coroutines */ - assert(s->ioc); qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); + } else if (s->sioc) { + /* abort negotiation */ + qio_channel_shutdown(QIO_CHANNEL(s->sioc), QIO_CHANNEL_SHUTDOWN_BOTH, + NULL); } + s->state = NBD_CLIENT_QUIT; if (s->connection_co) { if (s->connection_co_sleep_ns_state) { @@ -1459,6 +1463,9 @@ static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc, int ret; trace_nbd_client_handshake(s->export); + + s->sioc = sioc; + qio_channel_set_blocking(QIO_CHANNEL(sioc), false, NULL); qio_channel_attach_aio_context(QIO_CHANNEL(sioc), aio_context); @@ -1473,6 +1480,7 @@ static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc, g_free(s->info.name); if (ret < 0) { object_unref(OBJECT(sioc)); + s->sioc = NULL; return ret; } if (s->x_dirty_bitmap && !s->info.base_allocation) { @@ -1498,8 +1506,6 @@ static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc, } } - s->sioc = sioc; - if (!s->ioc) { s->ioc = QIO_CHANNEL(sioc); object_ref(OBJECT(s->ioc)); @@ -1520,6 +1526,7 @@ static int nbd_client_handshake(BlockDriverState *bs, QIOChannelSocket *sioc, nbd_send_request(s->ioc ?: QIO_CHANNEL(sioc), &request); object_unref(OBJECT(sioc)); + s->sioc = NULL; return ret; } -- 2.21.0