Hi all!

There are problems with the nbd driver:

- nbd reconnect is cancelled on drain, which is bad, as Roman describes
  in his "[PATCH 0/7] block/nbd: decouple reconnect from drain"

- nbd driver is too complicated around drained sections and aio context
  switch. It's nearly impossible to follow all the logic, including the
  abuse of bs->in_flight, which is temporarily decreased in some places
  (like nbd_read_eof()). The additional reconnect thread and two
  different state machines (we have BDRVNBDState::state and
  BDRVNBDState::connect_thread->state) don't make things simpler :)

So, I have a plan:

1. Move nbd negotiation to connect_thread

2. Receive NBD replies in request coroutines, not in connection_co

   At this point we can drop connection_co, and when we don't have an
   endlessly running coroutine, the NBD driver becomes a usual block
   driver, and we can drop the abuse of bs->in_flight, and probably drop
   most of the complicated logic around drained sections and aio context
   switch in the nbd driver.

3. Still, as Roman describes, with [2] we lose the possibility to
   reconnect immediately when the connection breaks (with the current
   code we have an endless read in reconnect_co, but actually for this
   to work keep-alive should be set up correctly). So, we'll need to
   reinvent it, checking the connection periodically by timeout, with
   the help of getsockopt or just sending a kind of PING request
   (zero-length READ or something like this).

And this series is a kind of preparation. The main point of it is moving
connect-thread to a separate file.

This series may crash on iotest 277.
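
As a side note on item 3 of the plan: a rough sketch of what the socket-level part of such a periodic check could look like, using only plain POSIX calls. The helper names sock_enable_keepalive() and sock_pending_error() are made up for illustration and are not existing QEMU functions; how this would be wired into a timer in the nbd driver is left open. It also assumes a TCP socket. Keep-alive timing knobs (idle time, probe interval, probe count) are platform-specific and not shown.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not a QEMU function): ask the kernel to send
 * keep-alive probes on an idle connection, so that a dead peer is
 * eventually noticed even when no request is in flight.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int sock_enable_keepalive(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/*
 * Hypothetical helper (not a QEMU function): meant to be called
 * periodically from a timer. Returns 0 if no error is pending on the
 * socket, or a positive errno value otherwise.
 *
 * SO_ERROR only reports failures the kernel has already detected (for
 * example via keep-alive probes), and reading it clears the pending
 * error, so the caller should act on a non-zero result right away.
 */
static int sock_pending_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) {
        return errno;
    }
    return err;
}
```

A non-zero result from sock_pending_error() would be the trigger to start reconnecting. The PING-style alternative (a zero-length READ) would also detect a peer that is alive at the TCP level but no longer answering NBD requests, which this check cannot do.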

To avoid that crash on iotest 277, it's based on the corresponding fix,
"[PATCH 1/7] block/nbd: avoid touching freed connect_thread":

Based-on: <20210315060611.2989049-2-rvka...@yandex-team.ru>

Vladimir Sementsov-Ogievskiy (14):
  block/nbd: BDRVNBDState: drop unused connect_err
  block/nbd: nbd_co_establish_connection(): drop unused errp
  block/nbd: drop unused NBDConnectThread::err field
  block/nbd: split connect_thread_cb() out of connect_thread_func()
  block/nbd: rename NBDConnectThread to NBDConnectCB
  block/nbd: further segregation of connect-thread
  block/nbd: drop nbd_free_connect_thread()
  block/nbd: move nbd connect-thread to nbd/client-connect.c
  block/nbd: NBDConnectCB: drop bh_* fields
  block/nbd: move wait_connect field under mutex protection
  block/nbd: refactor connect_bh()
  block/nbd: refactor nbd_co_establish_connection
  block/nbd: nbd_co_establish_connection_cancel(): rename wake to
    do_wake
  block/nbd: drop thr->state

 include/block/nbd.h  |   6 +
 block/nbd.c          | 266 ++++++++++++++-----------------------------
 nbd/client-connect.c |  72 ++++++++++++
 nbd/meson.build      |   1 +
 4 files changed, 162 insertions(+), 183 deletions(-)
 create mode 100644 nbd/client-connect.c

-- 
2.29.2