On 08.10.2015 at 08:15, Markus Armbruster wrote:
> Max Reitz <mre...@redhat.com> writes:
> > E.g. you may have a block filter in the future where you want to
> > exchange its child BDS. This exchange should be an atomic operation, so
> > we cannot use this interface there anyway. For quorum, such an exchange
> > does not need to be atomic, since you can just add the new child first
> > and remove the old one afterwards.
> >
> > So maybe in the future we get some block driver other than quorum for
> > which adding and removing children (as opposed to atomically exchanging
> > them) makes sense, but for now I can only see quorum. Therefore, that
> > this works for quorum only is in my opinion not a reason to make it
> > experimental. I think we actually want to keep it that way.
>
> Are you telling us the existing interface is insufficiently general?
> That the general interface needs to support atomic replacement?
>
> If yes, why isn't the general interface what we should do for quorum?
> Delete is atomic replacement by nothing, add is atomic replacement of
> nothing.

The general thing is what we used to call "dynamic reconfiguration". If
we want a single command to enable it (which would be great if we
could), we need to do some more design work first. Atomic replacement
might be the operation we're looking for, but we have to be sure.

So far we haven't thought about dynamic reconfiguration enough that we
would know the right solution, but enough that we know it's hard. That
would be an argument for me that makes adding an x-* command now
acceptable. On the other hand, the fact that we want a single command in
the end makes me want to keep it experimental.

What types of dynamic reconfiguration do we need to support? I'll start
with a small list, feel free to extend it:

* Replace a child node with another node. This works pretty much
  everywhere in the tree - including the root, i.e. BlockBackend!
  Working just on BDSes doesn't seem to be enough.

* Adding a child to a node that can take additional children (e.g.
  quorum can take an arbitrary number; but also COW image formats have
  an optional child, so you could add a backing file to a node
  originally opened with BDRV_O_NO_BACKING)

  Same as atomically replacing nothing by a node.

* Removing an optional child from a node that remains valid with that
  child removed. The same examples apply.

  Same as atomically replacing a child by nothing.

* Add a new node between two existing nodes. An example is taking a live
  snapshot, which inserts a new node between the BlockBackend and the
  first BDS. Or it could be used to insert a filter somewhere in the
  graph.

  Same as creating the new node pointing to node B (or dynamically
  adding it) and then atomically replacing the reference of node A that
  pointed to B with a reference to the new node.

* Add a new node between multiple existing nodes.
  This is one of the examples we always used to discuss with dynamic
  reconfiguration:

      base <- sn1 <- sn2 <--+-- virtio-blk
                            |
                            +-- NBD server

  Adding a filter could result in this:

      base <- sn1 <- sn2 <- throttling <--+-- virtio-blk
                                          |
                                          +-- NBD server

  Or this:

      base <- sn1 <- sn2 <--+-- throttling <- virtio-blk
                            |
                            +-- NBD server

  Or this:

      base <- sn1 <- sn2 <--+-- virtio-blk
                            |
                            +-- throttling <- NBD server

  All of these are different kinds of "adding a filter", and all of them
  are valid operations that a user could want to perform.

  Cases 2 and 3 are really just "add a new node between two existing
  nodes", as explained above.

  Case 1 is new: We still create the throttling node so that it already
  points to sn2, but now we have to atomically change the children of
  two things (the BlockBackends of virtio-blk and the NBD server). Not a
  problem per se because we can just do that, but it raises the question
  whether the atomic replacement operation needs to be transactionable.

* Remove a node between two (or more) other nodes.

  Same as atomically replacing a child by a grandchild. For more than
  two involved nodes, again a transactional version might be needed.

So at first sight, this operation seems to work as the general
interface for dynamic reconfiguration.

One problem we discussed earlier, and I'm not sure whether it's
related, is filter nodes that get inserted automatically once we change
e.g. the I/O throttling QMP commands to add a throttling filter BDS to
the graph.

If the user creates nodes A and B, but sets throttling options, they
might end up with a chain like this:

    A <- throttling <- B

Now imagine that they want to add another filter between A and B, let's
say blkdebug. They would need to know that they have to insert the new
node between A and throttling or B and throttling, but not between A
and B.
If they tried to insert it between A and B, the algorithm above says
that they would let blkdebug point to A, and replace B's child with
blkdebug. The resulting tree wouldn't be throttled any more!

    A <- blkdebug <- B

    throttling (ref = 0 -> delete)

Even if they knew that they have to consider the throttling node, it
currently wouldn't have a node-name, and with Jeff's autogenerated names
it wouldn't be predictable.

Maybe the dynamic reconfiguration interface does need to be a bit
cleverer.

Anyway, after writing all of this, I'm almost convinced now that an
experimental interface is the right thing to do in this patch series.

Kevin
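
For illustration only, here is a toy Python model of the "atomic
replacement" idea from the list above. None of this is QEMU code: Node,
replace_children, reachable and the slot names ("root", "file",
"backing", "image") are all made up for the sketch, and "atomic" here
only means "validate every edit first, then apply them all". It shows
case 1 (two parents switched in one step) and the automatically inserted
throttling node getting dropped.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(eq=False)
class Node:
    """Toy stand-in for a graph node (BlockBackend or BDS)."""
    name: str
    children: Dict[str, "Node"] = field(default_factory=dict)  # slot -> child


# One edit: set `slot` of `parent` to `new`; None stands for "nothing".
Edit = Tuple[Node, str, Optional[Node]]


def reachable(root: Node) -> Set[str]:
    """Names of all nodes reachable from root, including root itself."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node.name not in seen:
            seen.add(node.name)
            stack.extend(node.children.values())
    return seen


def replace_children(edits: List[Edit]) -> None:
    """Apply all edits or none of them: validate first, then commit."""
    # Simplification: each edit is checked against the pre-edit graph only.
    for parent, slot, new in edits:
        if new is not None and parent.name in reachable(new):
            raise ValueError(f"{slot} of {parent.name} would create a cycle")
    for parent, slot, new in edits:
        if new is None:
            parent.children.pop(slot, None)  # child -> nothing (remove)
        else:
            parent.children[slot] = new      # nothing/old -> new (add/replace)


# Case 1: put one throttling node in front of both users of sn2.
base = Node("base")
sn1 = Node("sn1", {"backing": base})
sn2 = Node("sn2", {"backing": sn1})
virtio = Node("virtio-blk", {"root": sn2})
nbd = Node("NBD server", {"root": sn2})

throttling = Node("throttling", {"file": sn2})  # already points to sn2
replace_children([(virtio, "root", throttling), (nbd, "root", throttling)])

# Pitfall: A <- auto-throttling <- B, then blkdebug "between A and B".
a = Node("A")
auto_thr = Node("auto-throttling", {"file": a})
b = Node("B", {"backing": auto_thr})

blkdebug = Node("blkdebug", {"image": a})
replace_children([(b, "backing", blkdebug)])
print(sorted(reachable(b)))  # the throttling node is no longer in B's tree
```

Adding and removing a child are the same call with "nothing" (None) on
one side, and the multi-parent case is just a list with more than one
edit, which is where the question of transactions comes in.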