[Qemu-block] Dynamic reconfiguration (was: qmp: add monitor command to add/remove a child)

Kevin Wolf Thu, 08 Oct 2015 04:04:55 -0700

Am 08.10.2015 um 08:15 hat Markus Armbruster geschrieben:
> Max Reitz <mre...@redhat.com> writes:
> > E.g. you may have a block filter in the future where you want to
> > exchange its child BDS. This exchange should be an atomic operation, so
> > we cannot use this interface there anyway. For quorum, such an exchange
> > does not need to be atomic, since you can just add the new child first
> > and remove the old one afterwards.
> >
> > So maybe in the future we get some block driver other than quorum for
> > which adding and removing children (as opposed to atomically exchanging
> > them) makes sense, but for now I can only see quorum. Therefore, that
> > this works for quorum only is in my opinion not a reason to make it
> > experimental. I think we actually want to keep it that way.
> 
> Are you telling us the existing interface is insufficiently general?
> That the general interface neeeds to support atomic replacement?
> 
> If yes, why isn't the general interface is what we should do for quorum?
> Delete is atomic replacement by nothing, add is atomic replacement of
> nothing.


The general thing is what we used to call "dynamic reconfiguration". If
we want a single command to enable it (which would be great if we
could), we need to some more design work first. Atomic replacement might
be the operation we're looking for, but we have to be sure.

So far we haven't thought about dynamic reconfiguation enough that we
would know the right solution, but enough that we know it's hard. That
would be an argument for me that makes adding an x-* command now
acceptable. On the other hand, the fact that we want a single command in
the end makes me want to keep it experimental.


What types of dynamic reconfiguration do we need to support? I'll start
with a small list, feel free to extend it:

* Replace a child node with another node. This works pretty much
  everywhere in the tree - including the root, i.e. BlockBackend!
  Working just on BDSes doesn't seem to be enough.

* Adding a child to a node that can take additional children (e.g.
  quorum can take an arbitrary number; but also COW image formats have
  an option child, so you could add a backing file to a node originally
  opened with BDRV_O_NO_BACKING)

  Same as atomically replacing nothing by a node.

* Removing an optional child from a node that remains valid with that
  child removed. The same examples apply.

  Same as atomically replacing a child by nothing.

* Add a new node between two existing nodes. An example is taking a live
  snapshot, which inserts a new node between the BlockBackend and the
  first BDS. Or it could be used to insert a filter somewhere in the
  graph.

  Same as creating the new node pointing to node B (or dynamically
  adding it) and then atomically replacing the reference of node A that
  pointed to B with a reference to the new node.

* Add a new node between multiple existing nodes. This is one of the
  examples we always used to discuss with dynamic reconfiguration:

      base <- sn1 <- sn2 <--+-- virtio-blk
                            |
                            +-- NBD server

  Adding a filter could result in this:

        base <- sn1 <- sn2 <- throttling <--+-- virtio-blk
                                          |
                                          +-- NBD server

  Or this:

        base <- sn1 <- sn2 <--+-- throttling <- virtio-blk
                              |
                              +-- NBD server

  Or this:

        base <- sn1 <- sn2 <--+-- virtio-blk
                              |
                              +-- throttling <- NBD server

  All of these are different kinds of "adding a filter", and all of
  them are valid operations that a user could want to perform.

  Case 2 and 3 are really just "add a new node between two existing
  nodes", as explained above.

  Case 1 is new: We still create the throttling node so that it already
  points to sn2, but now we have to atomically change the children of
  two things (the BlockBackends of virtio-blk and the NBD server). Not a
  problem per se because we can just do that, but it raises the question
  whether the atomic replacement operation needs to be transactionable.

* Remove a node between two (or more) other nodes.

  Same as atomically replacing a child by a grandchild. For more than
  two involved nodes, again a transactional version might be needed.

So at the first sight, this operation seems to work as the general
interface for dynamic reconfiguration.


One problem we discussed earlier that I'm not sure whether it's related
is filter nodes inserted automatically once we change e.g. the I/O
throttling QMP commands to add a throttling filter BDS to the graph.

If the user creates nodes A and B, but sets throttling options, they
might end up with a chain like this:

    A <- throttling <- B

Now imagine that they want to add another filter between A and B, let's
say blkdebug. They would need to know that they have to insert the new
node between A and throttling or B and throttling, but not between A and
B. If they tried to insert it between A and B, the algorithm above says
that they would let blkdebug point to A, and replace B's child with
blkdebug, the resulting tree wouldn't be throttled any more!

    A <- blkdebug <- B

         throttling (ref = 0 -> delete)

Even if they knew that they have to consider the throttling node, it
currently wouldn't have a node-name, and with Jeff's autogenerated names
it wouldn't be predictable.

Maybe the dynamic reconfiguration interface does need to be a bit
cleverer.


Anyway, after writing all of this, I'm almost convinced now that an
experimental interface is the right thing to do in this patch series.

Kevin

[Qemu-block] Dynamic reconfiguration (was: qmp: add monitor command to add/remove a child)

Reply via email to