Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-08-02 Thread Stefan Hajnoczi
On Fri, Jul 28, 2017 at 01:55:38PM +0200, Kevin Wolf wrote:
> On 28.07.2017 at 00:09, John Snow wrote:
> > On 07/26/2017 02:23 PM, Manos Pitsidianakis wrote:
> > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
> > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> > I think I agree with Stefan's concerns that this is more of a
> > temporary workaround for the benefit of jobs, but that there may well
> > be other reasons (especially in the future) where graph manipulations
> > may occur invisibly to the user.
> 
> We should do our best to avoid this. Implicit graph changes only make
> libvirt's life hard. I agree that we'll have many more graph changes in
> the future, but let's keep them as explicit as possible.

Speaking of implicit graph changes: the COLO replication driver
(block/replication.c) uses the backup job internally and that requires
before-write notifiers.  That means there is an implicit graph change
when COLO decides to complete the backup job.
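
For readers unfamiliar with the mechanism, here is a minimal toy sketch
(plain Python, not QEMU code) of the before-write notifier pattern the
backup job relies on: callbacks registered on a node run before each
write lands, so the old data can be copied out first.

class ToyNode:
    """Toy stand-in for a block node with before-write notifiers."""
    def __init__(self, size):
        self.data = bytearray(size)
        self.before_write = []                 # callbacks: fn(offset, length)

    def register_before_write(self, fn):
        self.before_write.append(fn)

    def write(self, offset, buf):
        for fn in self.before_write:           # every notifier runs before the write lands
            fn(offset, len(buf))
        self.data[offset:offset + len(buf)] = buf

source = ToyNode(16)
target = ToyNode(16)

def copy_before_write(offset, length):
    # Preserve the about-to-be-overwritten range in the backup target first.
    target.data[offset:offset + length] = source.data[offset:offset + length]

source.register_before_write(copy_before_write)
source.write(0, b"new data")                   # the old bytes reach the target first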

Stefan




Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-08-01 Thread Manos Pitsidianakis

On Tue, Aug 01, 2017 at 03:50:36PM +0200, Kevin Wolf wrote:

On 31.07.2017 at 19:30, Manos Pitsidianakis wrote:

On Fri, Jul 28, 2017 at 02:08:43PM +0200, Kevin Wolf wrote:
> On 27.07.2017 at 12:07, Stefan Hajnoczi wrote:
> > On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote:
> > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
> > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> > > > > This proposal follows a discussion we had with Kevin and Stefan on filter
> > > > > node management.
> > > > >
> > > > > With block filter drivers arises a need to configure filter nodes on runtime
> > > > > with QMP on live graphs. A problem with doing live graph modifications is
> > > > > that some block jobs modify the graph when they are done and don't operate
> > > > > under any synchronisation, resulting in a race condition if we try to insert
> > > > > a filter node in the place of an existing edge.
> > > >
> > > > But maybe you are thinking about higher level race conditions between
> > > > QMP commands.  Can you give an example of the race?
> > >
> > > Exactly, synchronisation between the QMP/user actions. Here's an example,
> > > correct me if I'm wrong:
> > >
> > > User issues a block-commit to this tree to commit A onto B:
> > >BB -> A -> B
> > >
> > > This inserts the dummy mirror node at the top since this is a mirror job
> > > (active commit)
> > >BB -> dummy -> A -> B
> > >
> > > User wants to add a filter node F below the BB. First we create the node:
> > >BB -> dummy -> A -> B
> > >   F /
> > >
> > > Commit finishes before we do block-insert-node, dummy and A is removed from
> > > the BB's chain, mirror_exit calls bdrv_replace_node()
> > >
> > >BB > B
> > >F -> /
> > >
> > > Now inserting F under BB will error since dummy does not exist any more.
> >
> > I see the diagrams but the flow isn't clear without the user's QMP
> > commands.
> >
> > F is created using blockdev-add?  What are the parameters and how does
> > it know about dummy?  I think this is an interesting question in itself
> > because dummy is a transient internal node that QMP users don't
> > necessarily know about.
>
> We can expect that users of block-insert-node also know about block job
> filter nodes, simply because the former is newer than the latter.
>
> (I also don't like the name "dummy", this makes it sound like it's not
> really a proper node. In reality, there is little reason to treat it
> specially.)
>
> > What is the full block-insert-node command?
> >
> > > The proposal doesn't guard against the following:
> > > User issues a block-commit to this tree to commit B onto C:
> > >BB -> A -> B -> C
> > >
> > > The dummy node (commit's top bs is A):
> > >BB -> A -> dummy -> B -> C
> > >
> > > blockdev-add of a filter we wish to eventually be between A and C:
> > >BB -> A -> dummy -> B -> C
> > > F/
> > >
> > > if commit finishes before we issue block-insert-node (commit_complete()
> > > calls bdrv_set_backing_hd() which only touches A)
> > >BB -> A --> C
> > >   F -> dummy -> B /
> > >resulting in error when issuing block-insert-node,
> > >
> > > else if we set the job to manual-delete and insert F:
> > >BB -> A -> F -> dummy -> B -> C
> > > deleting the job will result in F being lost since the job's top bs wasn't
> > > updated.
> >
> > manual-delete is a solution for block jobs.  The write threshold
> > feature is a plain QMP command that in the future will create a
> > transient internal node (before write notifier).
>
> Yes, that becomes a legacy QMP command then. The new way is blockdev-add
> and block-insert-node for write threshold, too.
>
> Apart from that, a write threshold node never disappears by itself, it
> is only inserted or removed in the context of a QMP command. That makes
> it a lot less dangerous than automatic block completion.
>
> > I'm not sure it makes sense to turn the write threshold feature into a
> > block job.  I guess it could work, but it seems a little unnatural and
> > maybe there will be other features that use transient internal nodes.
>
> Yeah, no reason to do so. Just a manually created filter node.
>

Can you explain what you have in mind? The current workflow is using
block-set-write-threshold on the targeted bs. If we want write threshold to
be on node level we would want a new filter driver so that it can take
options special to write-threshold.


Yes, I think that's what we want.


Unless we make the notify filter be a write threshold by default, and
when using it internally by the backup job, disable the threshold and
add our backup notifier to the node's list. Of course the current
write threshold code could be refactored into a new driver so that it
doesn't have to rely on notifiers.


As discussed on IRC, we shouldn't implement a bad interface (which
notifiers are, they bypass normal block device c

Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-08-01 Thread Kevin Wolf
On 31.07.2017 at 19:30, Manos Pitsidianakis wrote:
> On Fri, Jul 28, 2017 at 02:08:43PM +0200, Kevin Wolf wrote:
> > On 27.07.2017 at 12:07, Stefan Hajnoczi wrote:
> > > On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote:
> > > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
> > > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> > > > > > This proposal follows a discussion we had with Kevin and Stefan on 
> > > > > > filter
> > > > > > node management.
> > > > > >
> > > > > > With block filter drivers arises a need to configure filter nodes 
> > > > > > on runtime
> > > > > > with QMP on live graphs. A problem with doing live graph 
> > > > > > modifications is
> > > > > > that some block jobs modify the graph when they are done and don't 
> > > > > > operate
> > > > > > under any synchronisation, resulting in a race condition if we try 
> > > > > > to insert
> > > > > > a filter node in the place of an existing edge.
> > > > >
> > > > > But maybe you are thinking about higher level race conditions between
> > > > > QMP commands.  Can you give an example of the race?
> > > >
> > > > Exactly, synchronisation between the QMP/user actions. Here's an 
> > > > example,
> > > > correct me if I'm wrong:
> > > >
> > > > User issues a block-commit to this tree to commit A onto B:
> > > >BB -> A -> B
> > > >
> > > > This inserts the dummy mirror node at the top since this is a mirror job
> > > > (active commit)
> > > >BB -> dummy -> A -> B
> > > >
> > > > User wants to add a filter node F below the BB. First we create the 
> > > > node:
> > > >BB -> dummy -> A -> B
> > > >   F /
> > > >
> > > > Commit finishes before we do block-insert-node, dummy and A is removed 
> > > > from
> > > > the BB's chain, mirror_exit calls bdrv_replace_node()
> > > >
> > > >BB > B
> > > >F -> /
> > > >
> > > > Now inserting F under BB will error since dummy does not exist any more.
> > > 
> > > I see the diagrams but the flow isn't clear without the user's QMP
> > > commands.
> > > 
> > > F is created using blockdev-add?  What are the parameters and how does
> > > it know about dummy?  I think this is an interesting question in itself
> > > because dummy is a transient internal node that QMP users don't
> > > necessarily know about.
> > 
> > We can expect that users of block-insert-node also know about block job
> > filter nodes, simply because the former is newer than the latter.
> > 
> > (I also don't like the name "dummy", this makes it sound like it's not
> > really a proper node. In reality, there is little reason to treat it
> > specially.)
> > 
> > > What is the full block-insert-node command?
> > > 
> > > > The proposal doesn't guard against the following:
> > > > User issues a block-commit to this tree to commit B onto C:
> > > >BB -> A -> B -> C
> > > >
> > > > The dummy node (commit's top bs is A):
> > > >BB -> A -> dummy -> B -> C
> > > >
> > > > blockdev-add of a filter we wish to eventually be between A and C:
> > > >BB -> A -> dummy -> B -> C
> > > > F/
> > > >
> > > > if commit finishes before we issue block-insert-node (commit_complete()
> > > > calls bdrv_set_backing_hd() which only touches A)
> > > >BB -> A --> C
> > > >   F -> dummy -> B /
> > > >resulting in error when issuing block-insert-node,
> > > >
> > > > else if we set the job to manual-delete and insert F:
> > > >BB -> A -> F -> dummy -> B -> C
> > > > deleting the job will result in F being lost since the job's top bs 
> > > > wasn't
> > > > updated.
> > > 
> > > manual-delete is a solution for block jobs.  The write threshold
> > > feature is a plain QMP command that in the future will create a
> > > transient internal node (before write notifier).
> > 
> > Yes, that becomes a legacy QMP command then. The new way is blockdev-add
> > and block-insert-node for write threshold, too.
> > 
> > Apart from that, a write threshold node never disappears by itself, it
> > is only inserted or removed in the context of a QMP command. That makes
> > it a lot less dangerous than automatic block completion.
> > 
> > > I'm not sure it makes sense to turn the write threshold feature into a
> > > block job.  I guess it could work, but it seems a little unnatural and
> > > maybe there will be other features that use transient internal nodes.
> > 
> > Yeah, no reason to do so. Just a manually created filter node.
> > 
> 
> Can you explain what you have in mind? The current workflow is using
> block-set-write-threshold on the targeted bs. If we want write threshold to
> be on node level we would want a new filter driver so that it can take
> options special to write-threshold.

Yes, I think that's what we want.

> Unless we make the notify filter be a write threshold by default, and
> when using it internally by the backup job, disable the threshold and
> add our backup notifier to the node's li

Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-07-31 Thread Manos Pitsidianakis

On Fri, Jul 28, 2017 at 02:08:43PM +0200, Kevin Wolf wrote:

On 27.07.2017 at 12:07, Stefan Hajnoczi wrote:

On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote:
> On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
> > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> > > This proposal follows a discussion we had with Kevin and Stefan on filter
> > > node management.
> > >
> > > With block filter drivers arises a need to configure filter nodes on runtime
> > > with QMP on live graphs. A problem with doing live graph modifications is
> > > that some block jobs modify the graph when they are done and don't operate
> > > under any synchronisation, resulting in a race condition if we try to insert
> > > a filter node in the place of an existing edge.
> >
> > But maybe you are thinking about higher level race conditions between
> > QMP commands.  Can you give an example of the race?
>
> Exactly, synchronisation between the QMP/user actions. Here's an example,
> correct me if I'm wrong:
>
> User issues a block-commit to this tree to commit A onto B:
>BB -> A -> B
>
> This inserts the dummy mirror node at the top since this is a mirror job
> (active commit)
>BB -> dummy -> A -> B
>
> User wants to add a filter node F below the BB. First we create the node:
>BB -> dummy -> A -> B
>   F /
>
> Commit finishes before we do block-insert-node, dummy and A is removed from
> the BB's chain, mirror_exit calls bdrv_replace_node()
>
>BB > B
>F -> /
>
> Now inserting F under BB will error since dummy does not exist any more.

I see the diagrams but the flow isn't clear without the user's QMP
commands.

F is created using blockdev-add?  What are the parameters and how does
it know about dummy?  I think this is an interesting question in itself
because dummy is a transient internal node that QMP users don't
necessarily know about.


We can expect that users of block-insert-node also know about block job
filter nodes, simply because the former is newer than the latter.

(I also don't like the name "dummy", this makes it sound like it's not
really a proper node. In reality, there is little reason to treat it
specially.)


What is the full block-insert-node command?

> The proposal doesn't guard against the following:
> User issues a block-commit to this tree to commit B onto C:
>BB -> A -> B -> C
>
> The dummy node (commit's top bs is A):
>BB -> A -> dummy -> B -> C
>
> blockdev-add of a filter we wish to eventually be between A and C:
>BB -> A -> dummy -> B -> C
> F/
>
> if commit finishes before we issue block-insert-node (commit_complete()
> calls bdrv_set_backing_hd() which only touches A)
>BB -> A --> C
>   F -> dummy -> B /
>resulting in error when issuing block-insert-node,
>
> else if we set the job to manual-delete and insert F:
>BB -> A -> F -> dummy -> B -> C
> deleting the job will result in F being lost since the job's top bs wasn't
> updated.

manual-delete is a solution for block jobs.  The write threshold
feature is a plain QMP command that in the future will create a
transient internal node (before write notifier).


Yes, that becomes a legacy QMP command then. The new way is blockdev-add
and block-insert-node for write threshold, too.

Apart from that, a write threshold node never disappears by itself, it
is only inserted or removed in the context of a QMP command. That makes
it a lot less dangerous than automatic block completion.


I'm not sure it makes sense to turn the write threshold feature into a
block job.  I guess it could work, but it seems a little unnatural and
maybe there will be other features that use transient internal nodes.


Yeah, no reason to do so. Just a manually created filter node.



Can you explain what you have in mind? The current workflow is using 
block-set-write-threshold on the targeted bs. If we want write 
threshold to be on node level we would want a new filter driver so that 
it can take options special to write-threshold. Unless we make the 
notify filter be a write threshold by default, and when using it 
internally by the backup job, disable the threshold and add our backup 
notifier to the node's list. Of course the current write threshold code 
could be refactored into a new driver so that it doesn't have to rely on 
notifiers.


The way I've currently done the conversion is to add an implicit filter 
when enabling the threshold, just like in backup jobs.
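
A rough sketch of what the write path of such a write-threshold filter
node could look like (plain Python, illustrative names only, not the
actual driver; the one-shot event mimics how block-set-write-threshold
and the BLOCK_WRITE_THRESHOLD event behave today):

class ToyWriteThresholdFilter:
    """Toy filter node: forwards writes to its child and signals once when a
    write first crosses the configured offset (illustrative names only)."""

    def __init__(self, child, threshold_offset, emit_event):
        self.child = child                  # the node being filtered
        self.threshold = threshold_offset   # 0 means "disabled"
        self.emit_event = emit_event        # e.g. queue a QMP event for the client

    def write(self, offset, buf):
        end = offset + len(buf)
        if self.threshold and end > self.threshold:
            self.emit_event({"event": "WRITE_THRESHOLD",
                             "amount-exceeded": end - self.threshold})
            self.threshold = 0              # one-shot, like the existing QMP feature
        self.child.write(offset, buf)       # pass the write through unchanged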




Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-07-31 Thread Stefan Hajnoczi
On Fri, Jul 28, 2017 at 02:08:43PM +0200, Kevin Wolf wrote:
> On 27.07.2017 at 12:07, Stefan Hajnoczi wrote:
> > On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote:
> > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
> > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> > > > > This proposal follows a discussion we had with Kevin and Stefan on 
> > > > > filter
> > > > > node management.
> > > > > 
> > > > > With block filter drivers arises a need to configure filter nodes on 
> > > > > runtime
> > > > > with QMP on live graphs. A problem with doing live graph 
> > > > > modifications is
> > > > > that some block jobs modify the graph when they are done and don't 
> > > > > operate
> > > > > under any synchronisation, resulting in a race condition if we try to 
> > > > > insert
> > > > > a filter node in the place of an existing edge.
> > > > 
> > > > But maybe you are thinking about higher level race conditions between
> > > > QMP commands.  Can you give an example of the race?
> > > 
> > > Exactly, synchronisation between the QMP/user actions. Here's an example,
> > > correct me if I'm wrong:
> > > 
> > > User issues a block-commit to this tree to commit A onto B:
> > >BB -> A -> B
> > > 
> > > This inserts the dummy mirror node at the top since this is a mirror job
> > > (active commit)
> > >BB -> dummy -> A -> B
> > > 
> > > User wants to add a filter node F below the BB. First we create the node:
> > >BB -> dummy -> A -> B
> > >   F /
> > > 
> > > Commit finishes before we do block-insert-node, dummy and A is removed 
> > > from
> > > the BB's chain, mirror_exit calls bdrv_replace_node()
> > > 
> > >BB > B
> > >F -> /
> > > 
> > > Now inserting F under BB will error since dummy does not exist any more.
> > 
> > I see the diagrams but the flow isn't clear without the user's QMP
> > commands.
> > 
> > F is created using blockdev-add?  What are the parameters and how does
> > it know about dummy?  I think this is an interesting question in itself
> > because dummy is a transient internal node that QMP users don't
> > necessarily know about.
> 
> We can expect that users of block-insert-node also know about block job
> filter nodes, simply because the former is newer than the latter.
> 
> (I also don't like the name "dummy", this makes it sound like it's not
> really a proper node. In reality, there is little reason to treat it
> specially.)
> 
> > What is the full block-insert-node command?
> > 
> > > The proposal doesn't guard against the following:
> > > User issues a block-commit to this tree to commit B onto C:
> > >BB -> A -> B -> C
> > > 
> > > The dummy node (commit's top bs is A):
> > >BB -> A -> dummy -> B -> C
> > > 
> > > blockdev-add of a filter we wish to eventually be between A and C:
> > >BB -> A -> dummy -> B -> C
> > > F/
> > > 
> > > if commit finishes before we issue block-insert-node (commit_complete()
> > > calls bdrv_set_backing_hd() which only touches A)
> > >BB -> A --> C
> > >   F -> dummy -> B /
> > >resulting in error when issuing block-insert-node,
> > > 
> > > else if we set the job to manual-delete and insert F:
> > >BB -> A -> F -> dummy -> B -> C
> > > deleting the job will result in F being lost since the job's top bs wasn't
> > > updated.
> > 
> > manual-delete is a solution for block jobs.  The write threshold
> > feature is a plain QMP command that in the future will create a
> > transient internal node (before write notifier).
> 
> Yes, that becomes a legacy QMP command then. The new way is blockdev-add
> and block-insert-node for write threshold, too.
> 
> Apart from that, a write threshold node never disappears by itself, it
> is only inserted or removed in the context of a QMP command. That makes
> it a lot less dangerous than automatic block completion.

Nice thinking - write threshold can be a node too!




Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-07-28 Thread Kevin Wolf
On 27.07.2017 at 12:07, Stefan Hajnoczi wrote:
> On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote:
> > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
> > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> > > > This proposal follows a discussion we had with Kevin and Stefan on 
> > > > filter
> > > > node management.
> > > > 
> > > > With block filter drivers arises a need to configure filter nodes on 
> > > > runtime
> > > > with QMP on live graphs. A problem with doing live graph modifications 
> > > > is
> > > > that some block jobs modify the graph when they are done and don't 
> > > > operate
> > > > under any synchronisation, resulting in a race condition if we try to 
> > > > insert
> > > > a filter node in the place of an existing edge.
> > > 
> > > But maybe you are thinking about higher level race conditions between
> > > QMP commands.  Can you give an example of the race?
> > 
> > Exactly, synchronisation between the QMP/user actions. Here's an example,
> > correct me if I'm wrong:
> > 
> > User issues a block-commit to this tree to commit A onto B:
> >BB -> A -> B
> > 
> > This inserts the dummy mirror node at the top since this is a mirror job
> > (active commit)
> >BB -> dummy -> A -> B
> > 
> > User wants to add a filter node F below the BB. First we create the node:
> >BB -> dummy -> A -> B
> >   F /
> > 
> > Commit finishes before we do block-insert-node, dummy and A is removed from
> > the BB's chain, mirror_exit calls bdrv_replace_node()
> > 
> >BB > B
> >F -> /
> > 
> > Now inserting F under BB will error since dummy does not exist any more.
> 
> I see the diagrams but the flow isn't clear without the user's QMP
> commands.
> 
> F is created using blockdev-add?  What are the parameters and how does
> it know about dummy?  I think this is an interesting question in itself
> because dummy is a transient internal node that QMP users don't
> necessarily know about.

We can expect that users of block-insert-node also know about block job
filter nodes, simply because the former is newer than the latter.

(I also don't like the name "dummy", this makes it sound like it's not
really a proper node. In reality, there is little reason to treat it
specially.)

> What is the full block-insert-node command?
> 
> > The proposal doesn't guard against the following:
> > User issues a block-commit to this tree to commit B onto C:
> >BB -> A -> B -> C
> > 
> > The dummy node (commit's top bs is A):
> >BB -> A -> dummy -> B -> C
> > 
> > blockdev-add of a filter we wish to eventually be between A and C:
> >BB -> A -> dummy -> B -> C
> > F/
> > 
> > if commit finishes before we issue block-insert-node (commit_complete()
> > calls bdrv_set_backing_hd() which only touches A)
> >BB -> A --> C
> >   F -> dummy -> B /
> >resulting in error when issuing block-insert-node,
> > 
> > else if we set the job to manual-delete and insert F:
> >BB -> A -> F -> dummy -> B -> C
> > deleting the job will result in F being lost since the job's top bs wasn't
> > updated.
> 
> manual-delete is a solution for block jobs.  The write threshold
> feature is a plain QMP command that in the future will create a
> transient internal node (before write notifier).

Yes, that becomes a legacy QMP command then. The new way is blockdev-add
and block-insert-node for write threshold, too.

Apart from that, a write threshold node never disappears by itself, it
is only inserted or removed in the context of a QMP command. That makes
it a lot less dangerous than automatic block completion.

> I'm not sure it makes sense to turn the write threshold feature into a
> block job.  I guess it could work, but it seems a little unnatural and
> maybe there will be other features that use transient internal nodes.

Yeah, no reason to do so. Just a manually created filter node.

> What I'm getting at is that there might be alternatives to manual-delete
> that work in the general case.  For example, if blockdev-add +
> block-insert-node are part of a QMP 'transaction' command, can we lock
> the graph to protect it against modifications?

We thought about this at the last STR meeting. However, not for much
longer than a minute. Markus and I both agreed that we wouldn't be the
ones to try and make blockdev-add transactionable. It sounds like a huge
amount of work and a high risk of bugs for little benefit.

Let's just make sure that we provide a way for management tools to avoid
any surprise changes in the block graph.

Kevin




Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-07-28 Thread Kevin Wolf
On 28.07.2017 at 00:09, John Snow wrote:
> On 07/26/2017 02:23 PM, Manos Pitsidianakis wrote:
> > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
> > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> > > > This proposal follows a discussion we had with Kevin and Stefan
> > > > on filter
> > > > node management.
> > > > 
> > > > With block filter drivers arises a need to configure filter
> > > > nodes on runtime
> > > > with QMP on live graphs. A problem with doing live graph
> > > > modifications is
> > > > that some block jobs modify the graph when they are done and
> > > > don't operate
> > > > under any synchronisation, resulting in a race condition if we
> > > > try to insert
> > > > a filter node in the place of an existing edge.
> > > 
> > > But maybe you are thinking about higher level race conditions between
> > > QMP commands.  Can you give an example of the race?
> > 
> > Exactly, synchronisation between the QMP/user actions. Here's an
> > example, correct me if I'm wrong:
> > 
> > User issues a block-commit to this tree to commit A onto B:
> > BB -> A -> B
> > 
> > This inserts the dummy mirror node at the top since this is a mirror job
> > (active commit)
> > BB -> dummy -> A -> B
> > 
> > User wants to add a filter node F below the BB. First we create the node:
> > BB -> dummy -> A -> B
> >F /
> > 
> > Commit finishes before we do block-insert-node, dummy and A is removed
> > from the BB's chain, mirror_exit calls bdrv_replace_node()
> > 
> > BB > B
> > F -> /
> > 
> > Now inserting F under BB will error since dummy does not exist any more.

So the basic scenario is the one from the last section of the STR block
meeting notes:

http://lists.gnu.org/archive/html/qemu-block/2016-12/msg00045.html

At the time, this scenario seemed to make sense, but in fact, as your
explanation shows, it's not a real problem. The theory that inserting the
filter would bring the mirror_top node back is false because removing
the mirror_top node updates _all_ of its parents (including the new
throttling node) to point to the BDS below, rather than just the BB.
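
As a toy model of that point (plain Python, not the real bdrv_* code):
replacing a node re-points every parent edge of the old node, so a filter
F that was already attached above the job's filter node simply ends up
above B as well and is not lost.

graph = {                    # parent -> list of children
    "BB":    ["dummy"],
    "F":     ["dummy"],      # filter created via blockdev-add, child = dummy
    "dummy": ["A"],
    "A":     ["B"],
    "B":     [],
}

def replace_node(old, new):
    # Re-point every edge that references 'old' to 'new' -- i.e. all parents.
    for parent in graph:
        graph[parent] = [new if c == old else c for c in graph[parent]]
    graph.pop(old, None)

# Active commit completes: A is committed into B and the job's filter node
# goes away, so *all* parents of both nodes are re-pointed to B.
replace_node("A", "B")
replace_node("dummy", "B")
print(graph)                 # {'BB': ['B'], 'F': ['B'], 'B': []}

A block-insert-node that still names the removed node would then fail with
a node-not-found style error, but nothing is left dangling.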

However, whether block-insert-node actually fails depends on how you
address the edge(s) on which you want to insert a filter node. We would
only get an error if you specify it as "insert filter above this node".
But that is not a way to unambiguously reference a single edge; you
could only update _all_ incoming edges of a node at once. Therefore I
don't think this is a good way to do things.

An unambiguous way to identify a single edge is the name of the parent
node (usually a node-name, but possibly a BB name - they share a single
namespace, so having one QMP argument for both could be enough) and of
the child (as in BdrvChild), e.g. "insert as new child=backing of
node=myimage".

This way, the example scenario would actually just work.
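
On the wire, such an edge-addressed insert might look roughly like this.
This is only an illustration of the proposal discussed in this thread:
block-insert-node does not exist as a QMP command, and the argument names
are guesses.

import json

insert_filter = {
    "execute": "block-insert-node",       # proposed command, not an existing QMP API
    "arguments": {
        "node": "myimage",                # parent end of the edge (node-name or BB name)
        "child": "backing",               # which BdrvChild of that parent to re-point
        "filter": "F",                    # node-name of the filter added with blockdev-add
    },
}

print(json.dumps(insert_filter, indent=2))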

> Exactly how much of a problem is this? You, the user, have instructed QEMU
> to do two conflicting things:
> 
> (1) Squash A down into B, and
> (2) Insert a filter above A
> 
> Shame on you for deleting the node you were trying to reference, no? And as
> an added bonus, QEMU will even error out on your command instead of trying
> to do something and getting it wrong.
> 
> Is this a problem?

I don't think these two actions can reasonably be considered
conflicting: adding a filter on top doesn't influence the commit
operation at all.

We used to just forbid everything while doing block jobs, but these
operations are block jobs because they are long-running, and just
forbidding everything for a long time isn't very helpful. There is no
reason why a user shouldn't be able to take a snapshot while they are
creating a backup, etc.

As I said above, filter nodes automagically going away at some point
without an explicit user command isn't necessarily a problem, but I
also don't feel very comfortable with it. It certainly feels like an
easy way to introduce bugs. Things should become much easier to manage
when all graph modifications are explicit.

I'm surprised that you seem to be opposed to an explicit job-delete now,
as we already discussed this before and it would solve several problems:

* Block job completion issues only an event. If the management tool
  reconnects, it has no way to get the same information with a query-*
  command. If jobs stay around until job-delete, this is not a problem.

* Completion is usually rather complicated with all of the graph changes
  and involves operations that can fail, but we don't have a way to
  actually report errors to the user. job-delete could do that.

* And finally the management problem we're discussing here. Yes, if I
  want to snapshot below a mirror node and the mirror goes away
  immediately before I send the command, libvirt will just get an error
  and can process all events and update its graph view and then retry
  with another node name.

  Though even if it's po

Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-07-28 Thread Manos Pitsidianakis

On Thu, Jul 27, 2017 at 06:09:04PM -0400, John Snow wrote:



On 07/26/2017 02:23 PM, Manos Pitsidianakis wrote:

On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:

On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
This proposal follows a discussion we had with Kevin and Stefan on filter
node management.

With block filter drivers arises a need to configure filter nodes on runtime
with QMP on live graphs. A problem with doing live graph modifications is
that some block jobs modify the graph when they are done and don't operate
under any synchronisation, resulting in a race condition if we try to insert
a filter node in the place of an existing edge.


But maybe you are thinking about higher level race conditions between
QMP commands.  Can you give an example of the race?


Exactly, synchronisation between the QMP/user actions. Here's an 
example, correct me if I'm wrong:


User issues a block-commit to this tree to commit A onto B:
   BB -> A -> B

This inserts the dummy mirror node at the top since this is a mirror 
job (active commit)

   BB -> dummy -> A -> B

User wants to add a filter node F below the BB. First we create the node:
   BB -> dummy -> A -> B
  F /

Commit finishes before we do block-insert-node, dummy and A is 
removed from the BB's chain, mirror_exit calls bdrv_replace_node()


   BB > B
   F -> /

Now inserting F under BB will error since dummy does not exist any more.



Exactly how much of a problem is this? You, the user, have instructed 
QEMU to do two conflicting things:


(1) Squash A down into B, and
(2) Insert a filter above A

Shame on you for deleting the node you were trying to reference, no? 
And as an added bonus, QEMU will even error out on your command 
instead of trying to do something and getting it wrong.


Is this a problem?






The proposal doesn't guard against the following:
User issues a block-commit to this tree to commit B onto C:
   BB -> A -> B -> C

The dummy node (commit's top bs is A):
   BB -> A -> dummy -> B -> C

blockdev-add of a filter we wish to eventually be between A and C:
   BB -> A -> dummy -> B -> C
F/

if commit finishes before we issue block-insert-node (commit_complete()
calls bdrv_set_backing_hd() which only touches A)
   BB -> A --> C
  F -> dummy -> B /
   resulting in error when issuing block-insert-node,


Because "B" and "dummy" have already actually been deleted, I assume. 
(The diagram is confusing.)


It fails because we have an edge (A, C) but C is not a child of F, so we
cannot insert F as a child of A.  (I rotated the usual ascii art by 90 deg
to save me from handling whitespace, but I might have made it worse.)


else if we set the job to manual-delete and insert F:
   BB -> A -> F -> dummy -> B -> C
deleting the job will result in F being lost since the job's top bs 
wasn't updated.




Seems like a fairly good reason not to pursue this avenue then, yes?


I think this is just a different problem: not a race, but one of broadcasting
graph operations to affected parties (block jobs).


I think I agree with Stefan's concerns that this is more of a 
temporary workaround for the benefit of jobs, but that there may well 
be other reasons (especially in the future) where graph manipulations 
may occur invisibly to the user.


Do you have any use cases for failure here that don't involve the user 
themselves asking for two conflicting things? That is, QEMU doing some 
graph reorganization without the user's knowledge coupled with the 
user asking for a new filter node?


Are there any failures that result in anything more dangerous or bad 
than a "404: node not found" style error where we simply reject the 
premise of what the user was attempting to accomplish?




I couldn't come up with a case that caused catastrophic failure, no 
(though I lack familiarity with block jobs). If this isn't a problem 
after all, good news.


Basically, block-insert-node needs to work without dangerous conflicts with
other graph operations. Is there a case where there's not just an error and
we can't recover?




Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-07-27 Thread John Snow



On 07/26/2017 02:23 PM, Manos Pitsidianakis wrote:

On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:

On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
This proposal follows a discussion we had with Kevin and Stefan on filter
node management.

With block filter drivers arises a need to configure filter nodes on runtime
with QMP on live graphs. A problem with doing live graph modifications is
that some block jobs modify the graph when they are done and don't operate
under any synchronisation, resulting in a race condition if we try to insert
a filter node in the place of an existing edge.


But maybe you are thinking about higher level race conditions between
QMP commands.  Can you give an example of the race?


Exactly, synchronisation between the QMP/user actions. Here's an 
example, correct me if I'm wrong:


User issues a block-commit to this tree to commit A onto B:
BB -> A -> B

This inserts the dummy mirror node at the top since this is a mirror job 
(active commit)

BB -> dummy -> A -> B

User wants to add a filter node F below the BB. First we create the node:
BB -> dummy -> A -> B
   F /

Commit finishes before we do block-insert-node, dummy and A is removed 
from the BB's chain, mirror_exit calls bdrv_replace_node()


BB > B
F -> /

Now inserting F under BB will error since dummy does not exist any more.



Exactly how much of a problem is this? You, the user, have instructed 
QEMU to do two conflicting things:


(1) Squash A down into B, and
(2) Insert a filter above A

Shame on you for deleting the node you were trying to reference, no? And 
as an added bonus, QEMU will even error out on your command instead of 
trying to do something and getting it wrong.


Is this a problem?


The proposal doesn't guard against the following:
User issues a block-commit to this tree to commit B onto C:
BB -> A -> B -> C

The dummy node (commit's top bs is A):
BB -> A -> dummy -> B -> C

blockdev-add of a filter we wish to eventually be between A and C:
BB -> A -> dummy -> B -> C
 F/

if commit finishes before we issue block-insert-node (commit_complete()
calls bdrv_set_backing_hd() which only touches A)
BB -> A --> C
   F -> dummy -> B /
resulting in error when issuing block-insert-node,


Because "B" and "dummy" have already actually been deleted, I assume. 
(The diagram is confusing.)




else if we set the job to manual-delete and insert F:
BB -> A -> F -> dummy -> B -> C
deleting the job will result in F being lost since the job's top bs 
wasn't updated.




Seems like a fairly good reason not to pursue this avenue then, yes?

I think I agree with Stefan's concerns that this is more of a temporary 
workaround for the benefit of jobs, but that there may well be other 
reasons (especially in the future) where graph manipulations may occur 
invisibly to the user.


Do you have any use cases for failure here that don't involve the user 
themselves asking for two conflicting things? That is, QEMU doing some 
graph reorganization without the user's knowledge coupled with the user 
asking for a new filter node?


Are there any failures that result in anything more dangerous or bad 
than a "404: node not found" style error where we simply reject the 
premise of what the user was attempting to accomplish?






Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-07-27 Thread Stefan Hajnoczi
On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote:
> On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
> > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> > > This proposal follows a discussion we had with Kevin and Stefan on filter
> > > node management.
> > > 
> > > With block filter drivers arises a need to configure filter nodes on 
> > > runtime
> > > with QMP on live graphs. A problem with doing live graph modifications is
> > > that some block jobs modify the graph when they are done and don't operate
> > > under any synchronisation, resulting in a race condition if we try to 
> > > insert
> > > a filter node in the place of an existing edge.
> > 
> > But maybe you are thinking about higher level race conditions between
> > QMP commands.  Can you give an example of the race?
> 
> Exactly, synchronisation between the QMP/user actions. Here's an example,
> correct me if I'm wrong:
> 
> User issues a block-commit to this tree to commit A onto B:
>BB -> A -> B
> 
> This inserts the dummy mirror node at the top since this is a mirror job
> (active commit)
>BB -> dummy -> A -> B
> 
> User wants to add a filter node F below the BB. First we create the node:
>BB -> dummy -> A -> B
>   F /
> 
> Commit finishes before we do block-insert-node, dummy and A is removed from
> the BB's chain, mirror_exit calls bdrv_replace_node()
> 
>BB > B
>F -> /
> 
> Now inserting F under BB will error since dummy does not exist any more.

I see the diagrams but the flow isn't clear without the user's QMP
commands.

F is created using blockdev-add?  What are the parameters and how does
it know about dummy?  I think this is an interesting question in itself
because dummy is a transient internal node that QMP users don't
necessarily know about.

What is the full block-insert-node command?

> The proposal doesn't guard against the following:
> User issues a block-commit to this tree to commit B onto C:
>BB -> A -> B -> C
> 
> The dummy node (commit's top bs is A):
>BB -> A -> dummy -> B -> C
> 
> blockdev-add of a filter we wish to eventually be between A and C:
>BB -> A -> dummy -> B -> C
> F/
> 
> if commit finishes before we issue block-insert-node (commit_complete()
> calls bdrv_set_backing_hd() which only touches A)
>BB -> A --> C
>   F -> dummy -> B /
>resulting in error when issuing block-insert-node,
> 
> else if we set the job to manual-delete and insert F:
>BB -> A -> F -> dummy -> B -> C
> deleting the job will result in F being lost since the job's top bs wasn't
> updated.

manual-delete is a solution for block jobs.  The write threshold feature
is a plain QMP command that in the future will create a transient
internal node (before write notifier).

I'm not sure it makes sense to turn the write threshold feature into a
block job.  I guess it could work, but it seems a little unnatural and
maybe there will be other features that use transient internal nodes.

What I'm getting at is that there might be alternatives to manual-delete
that work in the general case.  For example, if blockdev-add +
block-insert-node are part of a QMP 'transaction' command, can we lock
the graph to protect it against modifications?
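
For illustration only, the grouped request might look like the payload
below. Neither blockdev-add nor the proposed block-insert-node is actually
a transactionable action today, and the argument names are hypothetical.

import json

txn = {
    "execute": "transaction",
    "arguments": {
        "actions": [
            # both actions below are hypothetical as transaction members
            {"type": "blockdev-add",
             "data": {"driver": "throttle", "node-name": "F",
                      "file": "myimage"}},          # driver-specific options omitted
            {"type": "block-insert-node",
             "data": {"node": "myimage", "child": "backing", "filter": "F"}},
        ]
    },
}

print(json.dumps(txn, indent=2))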




Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-07-26 Thread Manos Pitsidianakis

On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:

On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:

This proposal follows a discussion we had with Kevin and Stefan on filter
node management.

With block filter drivers arises a need to configure filter nodes on runtime
with QMP on live graphs. A problem with doing live graph modifications is
that some block jobs modify the graph when they are done and don't operate
under any synchronisation, resulting in a race condition if we try to insert
a filter node in the place of an existing edge.


But maybe you are thinking about higher level race conditions between
QMP commands.  Can you give an example of the race?


Exactly, synchronisation between the QMP/user actions. Here's an 
example, correct me if I'm wrong:


User issues a block-commit to this tree to commit A onto B:
   BB -> A -> B

This inserts the dummy mirror node at the top since this is a mirror job 
(active commit)

   BB -> dummy -> A -> B

User wants to add a filter node F below the BB. First we create the 
node:

   BB -> dummy -> A -> B
  F /

Commit finishes before we do block-insert-node, dummy and A is removed 
from the BB's chain, mirror_exit calls bdrv_replace_node()


   BB > B
   F -> /

Now inserting F under BB will error since dummy does not exist any more.

The proposal doesn't guard against the following:
User issues a block-commit to this tree to commit B onto C:
   BB -> A -> B -> C

The dummy node (commit's top bs is A):
   BB -> A -> dummy -> B -> C

blockdev-add of a filter we wish to eventually be between A and C:
   BB -> A -> dummy -> B -> C
F/

if commit finishes before we issue block-insert-node (commit_complete()
calls bdrv_set_backing_hd() which only touches A)
   BB -> A --> C
  F -> dummy -> B /
   resulting in error when issuing block-insert-node,

else if we set the job to manual-delete and insert F:
   BB -> A -> F -> dummy -> B -> C
deleting the job will result in F being lost since the job's top bs 
wasn't updated.
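
A toy model of the first branch above (plain Python, not QEMU code):
bdrv_set_backing_hd() re-points only the overlay A, so F stays attached to
the detached dummy -> B subchain and never ends up between A and C.

graph = {                    # parent -> list of children
    "BB":    ["A"],
    "A":     ["dummy"],
    "F":     ["dummy"],      # filter added via blockdev-add, child = dummy
    "dummy": ["B"],
    "B":     ["C"],
    "C":     [],
}

def set_backing_hd(parent, new_child):
    # Unlike replace_node(), this touches a single parent's backing edge only.
    graph[parent] = [new_child]

# Intermediate commit of B into C completes: commit_complete() updates A alone.
set_backing_hd("A", "C")
print(graph["A"], graph["F"])    # ['C'] ['dummy']  -> F never made it between A and C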






Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete

2017-07-26 Thread Stefan Hajnoczi
On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
> This proposal follows a discussion we had with Kevin and Stefan on filter
> node management.
> 
> With block filter drivers arises a need to configure filter nodes on runtime
> with QMP on live graphs. A problem with doing live graph modifications is
> that some block jobs modify the graph when they are done and don't operate
> under any synchronisation, resulting in a race condition if we try to insert
> a filter node in the place of an existing edge.

Block jobs *do* operate under synchronization.  They only manipulate the
graph from the main loop (under the QEMU global mutex just like a
monitor command).

But maybe you are thinking about higher level race conditions between
QMP commands.  Can you give an example of the race?

CCing John Snow (interested in commands for 'reaping' completed block
jobs) and Jeff Cody (block jobs maintainer).

