On Wed, Sep 10, 2014 at 04:54:19PM +0800, Fam Zheng wrote: > On Thu, 09/04 21:42, Stefan Hajnoczi wrote: > > On Tue, Aug 26, 2014 at 06:45:54AM +0000, Benoît Canet wrote: > > > On Tue, Aug 26, 2014 at 12:42:04PM +0800, Fam Zheng wrote: > > > > On Mon, 08/25 12:12, Benoît Canet wrote: > > > > > On Mon, Aug 25, 2014 at 05:37:37PM +0800, Fam Zheng wrote: > > > > > > On Mon, 08/25 09:06, Benoît Canet wrote: > > > > > > > On Mon, Aug 25, 2014 at 02:04:24PM +0800, Fam Zheng wrote: > > > > > > > > On Fri, 08/22 18:11, Benoît Canet wrote: > > > > > > > > > Since the block layer code is starting to modify the BDS > > > > > > > > > graph right in the > > > > > > > > > middle of BDS chains (block-mirror's replace parameter for > > > > > > > > > example) QEMU needs > > > > > > > > > to properly block and unblock whole BDS subtrees; recursion > > > > > > > > > is a neat way to > > > > > > > > > achieve this task. > > > > > > > > > > > > > > > > > > This patch also takes care of modifying the op blockers users. > > > > > > > > > > > > > > > > Is this going to replace backing_blocker? > > > > > > > > > > > > > > > > I think it is too general an approach to control the operation > > > > > > > > properly, > > > > > > > > because the op blocker may not work in the same way for all > > > > > > > > types of BDS > > > > > > > > connections. In other words, the choosing of op blockers are > > > > > > > > likely > > > > > > > > conditional on graph edge types, that's why backing_blocker was > > > > > > > > added here. For > > > > > > > > example, A VMDK extent connection will probably need a > > > > > > > > different set of > > > > > > > > blockers than bs->file connection. > > > > > > > > > > > > > > > > So could you explain in which cases is the recursive > > > > > > > > blocking/unblocking > > > > > > > > useful? > > > > > > > > > > > > > > It's designed for the new crop of block operations operating on > > > > > > > BDS located in > > > > > > > the middle of the backing chain: Jeff's patches, intermediate > > > > > > > live streaming or > > > > > > > intermediate mirroring. > > > > > > > Recursively blocking BDS allows to do these operations safely. > > > > > > > > > > > > Sorry I may be slow on this, but it's still not clear to me. > > > > > > > > > > > > That doesn't immediately show how backing_blocker doesn't work. > > > > > > These > > > > > > operations are in the category of operations that update graph > > > > > > topology, > > > > > > meaning that they drop, add or swap some nodes in the middle of the > > > > > > chain. It > > > > > > is not safe because there are used by the other nodes, but they are > > > > > > supposed to > > > > > > be protected by backing_blocker. Could you be more specific? > > > > > > > > > > I don't know particularly about the backing blocker case. > > > > > > > > > > > > > > > > > I can think of something more than backing_hd: there are also link > > > > > > types other > > > > > > than backing_hd, for example ->file, (vmdk)->extents or > > > > > > (quorum)->qcrs, etc. > > > > > > They should be protected as well. > > > > > > > > > > This patch takes cares of recursing everywhere. > > > > > > > > > > I can give you an example for quorum. > > > > > > > > > > If a streaming operation is running on a quorum block backend the > > > > > recursive > > > > > blocking will help to block any operation done directly on any of the > > > > > children. > > > > > > > > At what points should block layer recursively block/unblock the > > > > operations in > > > > this quorum case? > > > > > > When the streaming starts it should block all the top bs children. > > > So after when an operation tries to operate on a child of the top bs it > > > will be > > > forbidden. > > > > > > The beauty of it is that recursive blockers can easily replace regular > > > blockers. > > > > Let's think of a situation that recursive blockers protect but > > backing_blocker does not: > > > > a <- b <- c <- d > > > > c is the backing file and is therefore protected by the op blocker. > > > > The block-commit command works with node-names, however, so we can > > manipulate any nodes in the graph, not just the topmost one. Try this: > > > > block-commit d > > block-commit b > > > > I haven't checked yet but I suspect it will launch two block-commit jobs > > on the same partial chain (that's a bad thing because it can lead to > > corruption). > > 1) Does block-commit work with node-names already? In other words, is > block-commit b possible now? I only see drive-mirror works with it, but not > drive-backup, block-mirror or block-commit. > > 2) Regardless of the answer to 1), I think we could use a similar approach as > drive-backup here: split BLOCK_OP_TYPE_COMMIT to > BLOCK_OP_TYPE_COMMIT_{SOURCE,TARGET}, and only unblock > BLOCK_OP_TYPE_COMMIT_TARGET in bdrv_set_backing_hd.
I'll try to implement this while being at it. > > Unblocking BLOCK_OP_TYPE_COMMIT on backing_hd is wrong as long as we expose > "d" > to block-commit, with node-name. > > As the next step, let's think about what it takes to safely allow commit d to > b > with: > > a <- b <- c <- d <- e <- f <- g > > I think the answer is: if the commit job checks and sets blockers on range > [d, b], > it is safe. Because from e, f and g's points of view, the whole backing chain > is always complete and working, just like how device model thinks of the top > node > when it's being active committed nowadays. > > Based on that, we can even start commit on non-intersecting partial chains at > the same time: > > block-commit g up to e > block-commit d up to b > > , if we want to. > > That is only possible with a more specific blocking logic, instead of > recursive > blocker. Here your advice is the same as Kevin's one: make op blocker work on ranges. I'll change them to do so. Thanks for the input. Benoît > > Fam > > > > > With recursive blockers, not just c but also a and b are protected. > > > > To me, this demonstrates that recursive op blockers are safer than > > non-recursive backing_blocker and will prevent real-world cases that > > could lead to corruption. > > > > BTW I'm fairly happy with the patch itself, but like Fam I'm still > > thinking through different cases to convince myself the design is sound. > > > > Stefan > >