On 04.03.2012 20:08, Avi Kivity wrote:
> On 03/04/2012 02:41 PM, Michael Tokarev wrote:
>> Since all block (bdrv) layer is now implemented using
>> coroutines, I thought I'd give it a try.  But immediately
>> hit a question to which I don't know a good answer.
>>
>> Suppose we've some networking block device (like NBD) and
>> want to be able to support reconnection - this is actually
>> very useful feature, in order to be able to reboot/restart
>> the NBD server without a need to restart all the clients.
>>
>> For this to work, we should have an ability to reconnect
>> to the server and re-issue all requests which were waiting
>> for reply.
>>
>> Traditionally, in asyncronous event-loop-based scheme, this
>> is implemented as a queue of requests linked to the block
>> driver state structure, and in case of reconnection we just
>> walk over all requests and requeue these.
>>
>> But if the block driver is implemented as a set of coroutines
>> (like nbd currently does), I see no sane/safe way to restart
>> the requests.  Setjmp/longjmp can be uses with extra care
>> there, but with these it is extremly fragile.
>>
>> Any hints on how to do that?
> 
> From the block layer's point of view, the requests should still be
> pending.  For example, if a read request sees a dropped connection, it
> adds itself to a list of coroutines waiting for a reconnect, wakes up a
> connection manager coroutine (or thread), and sleeps.  The connection
> manager periodically tries to connect, and if it succeeds, it wakes up
> the coroutines waiting for a reconnection.

This is all fine, except of one thing: restarting (resending) of
the requests which has been sent to the remote and for which we
were waiting for reply already.

For these requests, they should be resent using new socket, when
the connection manager wakes corresponding coroutine up.  That's
where my question comes.

The request handling coroutine looks like regular function
(pseudocode):

 offset = 0;
 while(offset < sendsize) {
   ret = send(sock, senddata+offset, sendsize-offset);
   if (EAGAIN) { coroutine_yeld(); continue; }
   if (ret < 0) return -EIO;
   offset += ret;
 }
 offset = 0;
 while(offset < recvsize) {
   ret = recv(sock, recvdata+offset, recvsize-offset);
   if (EAGAIN) { coroutine_yeld(); continue; }
   if (ret < 0) return -EIO;
   offset += ret;
 }
 return status(recvdata);

This function will need to have a ton of "goto begin" in all places
where it calls yeld() -- in order to actually start _sending_ the
request to the new sock after a reconnect.  It is all good when it
is in one function, but if the code is split into several functions,
it becomes very clumsy, to a point where regular asyncronous state
machine bay be more appropriate.  It also requires to open-code all
helper functions (like qiov handlers).

So I was asking if it can be done using a setjmp/longjmp maybe, to
simplify it all somehow.

> It's important to implement request cancellation correctly here, or we
> can end up with a device that cannot be unplugged or a guest that cannot
> be shutdown.

Is there some mechanism to cancel bdrv_co_{read,write}v()?  I see a
way to cancel bdrv_aio_{read,write}v(), but even these are now
implemented as coroutines...

Thanks!

/mjt


Reply via email to