On Fri, Feb 08, 2013 at 09:38:11AM -0800, Zach Brown wrote: > > The draft implementation will look like this. struct bio should have > > some way to get current status of kiocb that generated bio. So we add > > a pointer to bool flag. > > > > struct bio { > > bool *cancelled; > > } > > > > in async DIO codepath this pointer will be initialized with bool at > > "struct kiocb" > > bio->cancelled = &kiocb->cancelled; > > Hmm. I suppose. It sure looks disgusting, but the iocb has forgotten > all of the dio state and bios that lead up to the operation once > submission has completed. It's easy enough to match the lifetime rules > of this reference with end_io's using the iocb.
That's part of it (w.r.t. the exact location of the cancelled flag), but the idea also isn't for this to be specific to kiocbs or anything like that. The problem, as you alluded to, is once a bio passes generic_make_request() the upper layer has no idea what the block layer is doing with it. If it's a stacking block device - or even if it's just getting bounced - the first thing that happens to the bio is it gets cloned. So, adding a mechanism for the upper layer to chase down those bios and whatever other state the lower layers created would be gross. Fuck that. And, you know my thoughts on callbacks; Ted pointed out some reasons we are probably going to need cancellation callbacks eventually but I personally _don't_ want this to get designed around callbacks. But if we just add this layer of indirection, when a bio is cloned we can copy the pointer too and cancellation magically Just Works. For md writes and probably some other things we won't be able to just blindly copy the pointer as that'd fuck up md's parity stuff, but md could still check the flag itself if it was worth the trouble. > > This method also lets one cancelled iocb flag many submitted bios as > cancelled, so that's nice. > > > So to cancel kiocb we do > > kiocb->cancelled = true; > > and all bio created from the request will not be send to device anymore. > > (With lots of comments to let us know that it being racey is fine.) Yeah. We should be really explicit about what this flag means; it _doesn't_ mean "this bio has been cancelled", it means "please try to cancel this bio". We don't know if the bio was succesfully cancelled until the bio is completed (and this doesn't change anything about how bio completion works) - if it was cancelled, the bi_end_io callback will get -ECANCELLED or something. This is very different from the existing AIO cancellation; KIF_CANCEL means "this kiocb _has been cancelled_". (sort of, I was just rereading that code and I'm convinced it's buggy). > > If the proposal is fine then I start implementing it. > > For a teeny hack like this the best proposal is a working prototype > patch. Don't wait for acks just dive in. Yeah, and I keep claiming I'm going to implement the AIO bits of this (I keep getting distracted by other things like taking a flamethrower to the dio code). Note that the semantics of this doesn't really match up with io_cancel(). That's not the end of the world, we can paper over that in the aio code without too much grossness (or at least it'll be localized grossness). IMO though, io_cancel() is dumb and we want something new. It's synchronous - and why anyone ever thought cancellation for _asynchronous_ io should be synchronous I don't know. Anyways, if io_cancel() is going to succeed it has to block until the io has been cancelled and it's got a completion - the completion isn't returned via io_getevents(). This makes _no sense_, and means an application that is making use of this probably is going to need a thread pool just to do cancellation. What we want is a new io_cancel_sane() syscall that doesn't return anything, and only marks the iocb as "cancellation pending". The completion would still get returned via io_getevents(), and userspace would find out if it was cancelled or not then. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/