09.06.2018 18:32, Vladimir Sementsov-Ogievskiy wrote:
Implement reconnect. To achieve this:

1. Replace the quit bool variable with a state. Four states are
   introduced:
    connecting-wait: reconnecting is in progress and there have been
      only a small number of reconnect attempts, so all requests are
      waiting for the connection.
    connecting-nowait: reconnecting is in progress and there have been
      many reconnect attempts, so all requests return errors.
    connected: normal state
    quit: exiting after fatal error or on close

Possible transitions are:

    * -> quit
    connecting-* -> connected
    connecting-wait -> connecting-nowait
    connected -> connecting-wait

2. Implement reconnect in connection_co. In connecting-* mode,
     connection_co tries to reconnect every NBD_RECONNECT_NS.
     Making this parameter configurable (as well as
     NBD_RECONNECT_ATTEMPTS, which sets the number of attempts before
     the transition from connecting-wait to connecting-nowait) may be
     done in a follow-up patch.

3. Retry nbd queries on channel error, if we are in connecting-wait
     state.

4. In init, wait for the connection until the transition to
     connecting-nowait. So NBD_RECONNECT_ATTEMPTS also bounds the
     number of failed attempts for the initial connection.
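
For illustration only, the state machine from items 1 and 2 can be sketched in standalone C as below. This is not code from the patch: the names ClientState, ST_*, RECONNECT_ATTEMPTS, transition_allowed() and after_failed_attempt() are hypothetical, the attempt limit of 3 is an arbitrary stand-in for NBD_RECONNECT_ATTEMPTS, and whether the real comparison is ">=" or ">" is an assumption.

```c
#include <stdbool.h>

/* Hypothetical names; the patch itself may spell these differently. */
typedef enum {
    ST_CONNECTING_WAIT,    /* reconnecting, requests wait for the connection */
    ST_CONNECTING_NOWAIT,  /* reconnecting, requests fail immediately */
    ST_CONNECTED,          /* normal state */
    ST_QUIT,               /* exiting after fatal error or on close */
} ClientState;

/* Arbitrary stand-in for NBD_RECONNECT_ATTEMPTS. */
#define RECONNECT_ATTEMPTS 3

/* Returns true if "from -> to" is one of the listed transitions. */
static bool transition_allowed(ClientState from, ClientState to)
{
    if (to == ST_QUIT) {
        return true;                      /* * -> quit (taken literally) */
    }
    switch (from) {
    case ST_CONNECTING_WAIT:
        /* connecting-* -> connected, connecting-wait -> connecting-nowait */
        return to == ST_CONNECTED || to == ST_CONNECTING_NOWAIT;
    case ST_CONNECTING_NOWAIT:
        return to == ST_CONNECTED;        /* connecting-* -> connected */
    case ST_CONNECTED:
        return to == ST_CONNECTING_WAIT;  /* connected -> connecting-wait */
    default:
        return false;                     /* nothing else leaves quit */
    }
}

/*
 * Bookkeeping after one failed reconnect attempt: count it and, once the
 * assumed limit is reached, stop making requests wait.
 */
static ClientState after_failed_attempt(ClientState s, unsigned *attempts)
{
    if (s != ST_CONNECTING_WAIT && s != ST_CONNECTING_NOWAIT) {
        return s;                         /* not reconnecting: no change */
    }
    (*attempts)++;
    if (s == ST_CONNECTING_WAIT && *attempts >= RECONNECT_ATTEMPTS) {
        return ST_CONNECTING_NOWAIT;
    }
    return s;
}
```

With the assumed limit of 3, the third consecutive failure moves connecting-wait to connecting-nowait, and the same bound applies to the initial connection described in item 4.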

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>

squash:

@@ -616,7 +617,10 @@ static coroutine_fn int nbd_co_receive_one_chunk(
         s->reply.handle = 0;
     }

-    if (s->connection_co) {
+    if (s->connection_co && !s->wait_in_flight) {
+        /* We must check s->wait_in_flight, because we may be entered by
+         * nbd_recv_coroutines_wake_all(); in this case we should not wake
+         * connection_co here, it will be woken by the last request. */
         aio_co_wake(s->connection_co);
     }



--
Best regards,
Vladimir

