On Thu, May 19, 2016 at 08:35:03AM +0200, Markus Pargmann wrote:
> Hi Wouter,
> 
> On Sun, May 15, 2016 at 02:55:39PM +0200, Wouter Verhelst wrote:
> > Hi Markus,
> > 
> > On Thu, May 12, 2016 at 11:53:01AM +0200, Markus Pargmann wrote:
> > > On Thursday 28 April 2016 18:27:34 Wouter Verhelst wrote:
> > > > However, at some point I agreed with Paul (your predecessor) that when
> > > > this happens due to an error condition (as opposed to it being due to an
> > > > explicit disconnect), the kernel would block all reads from or writes to
> > > > the device, and the client may try to reconnect *from the same
> > > > PID* (i.e., it may not fork()). If that succeeds, the next NBD_DO_IT is
> > > > assumed to be connected to the same server; if instead the process
> > > > exits, then the block device is assumed to be dead, will be reset, and
> > > > all pending reads or writes would error.
> > > > 
> > > > In principle, this allows for a proper reconnect from userspace if it
> > > > can be done. However, I'm not sure whether this ever worked well or
> > > > whether it was documented, so it's probably fine if you think it should
> > > > be replaced with something else.
> > > 
> > > At least I was not aware of this possibility. As far as I know the
> > > previous code even had issues with the signals used to kill on timeouts
> > > which also killed the userspace program sometimes.
> > > 
> > > Currently I can't see a code path that supports reconnects. But I may
> > > have removed that accidently in the past.
> > 
> > Right. Like I said, I'm not sure if it ever worked well. The user space
> > client has a -persist option that tries to implement it, but I've been
> > getting some bug reports from people who've tried it (although that may
> > have been my fault rather than the kernel's).
> > 
> > > > (obviously, userspace reconnecting the device to a different device is
> > > > wrong and should not be done, but that's a case of "if you break it, you
> > > > get to keep both pieces)
> > > > 
> > > > At any rate, I think it makes sense for userspace to be given a chance
> > > > to *attempt* to reconnect a device when the connection drops
> > > > unexpectedly.
> > > 
> > > Perhaps it would be better to setup the kernel driver explicitly for
> > > that. Perhaps some flag to let the kernel driver know that the client
> > > would like to keep the block device open? In that case the client could
> > > excplicitly use NBD_CLEAR_SOCK to cleanup everything.
> > 
> > I'm not sure what you mean by this. Can you clarify?
> 
> I meant that it might be better to have a separate way for NBD_DO_IT.
> Something where the client software can directly instruct the kernel to
> keep everything opened in case of an error so that the client may
> reconnect afterwards.
>
> This could be a new ioctl that sets it up, for example 'NBD_PERSISTENT'.

If we're going to add a new ioctl, I think it might be useful to have a
second "flags" type ioctl; one where the client can set options, and
where the return value of the ioctl indicates which options the kernel
understands and will/can honour.

This could then also be used for things like checking whether the kernel
supports structured writes, etc.

> The NBD_DO_IT afterwards would keep everything up and running in case of
> a connection error so that the client could set a new socket using
> NBD_SET_SOCK and reenter using NBD_DO_IT.
> 
> For all clients that are not capable of this mechanism or don't use it,
> NBD_DO_IT would clean up everything properly on any error.

Right.

> > > Or perhaps a completely new ioctl that can transmit back some more
> > > information about what failures were seen and whether the blockdevice
> > > was closed or not?
> > 
> > The intent was that ioctl(NBD_DO_IT) would return an error when the
> > disconnect was not requested, and would return 0 when the connection
> > dropped due to userspace doing ioctl(NBD_DISCONNECT), since dropping the
> > connection when userspace explicitly asks for it is not an error.
> > 
> > drivers/block/nbd.c contains the following:
> > 
> > static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
> >                        unsigned int cmd, unsigned long arg)
> > {
> > [...]
> >     case NBD_DO_IT: {
> > [...]
> >                 if (nbd->disconnect) /* user requested, ignore socket 
> > errors */
> >                         return 0;
> >                 return error;
> >     }
> > [...]
> > 
> > so the signalling part of it is at least still there. Whether it works,
> > I haven't tested.
> 
> I just looked up the kernel code from 4.0. This code was there as
> well. But the socket and blockdevice were both destroyed before leaving
> the NBD_DO_IT ioctl. So it seems to have never been really persistent.
> Filesystems would have still been killed.
> 
> So for a persistent nbd device there is some more code necessary to do
> it.

Okay.

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
Nbd-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nbd-general

Reply via email to