Re: [Qemu-devel] Minutes from the "Stuttgart block Gipfele"
On Mon, Jan 11, 2016 at 04:10:12PM +0100, Kevin Wolf wrote:
> Am 23.12.2015 um 09:33 hat Stefan Hajnoczi geschrieben:
> > On Fri, Dec 18, 2015 at 02:15:38PM +0100, Markus Armbruster wrote:
> > Another problem is that the backup block job and other operations that
> > require a single command today could require sequences of low-level
> > setup commands to create filter nodes. The QMP client would need to
> > first create a write notifier filter and then start the backup block job
> > with the write notifier node name. It's clumsy.
>
> I don't think splitting it up into several low-level commands is
> necessary. We don't expect the user to set any options for the filter
> (and if we did, they would probably have to match the same options for
> the block job), so we can keep the BDS creation as a part of starting
> the block job.
>
> The important part is that the management tool knows that a filter is
> going to be inserted and how to address it. In order to achieve that, we
> could simply add a 'filter-node-name' option to the QMP command starting
> the job.

That sounds like a good approach.

Stefan
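On the wire, the 'filter-node-name' approach agreed on above might look
like the following sketch. This is hypothetical: the option is only
proposed in this thread, and the exact command shape is an assumption.

    # A single QMP command starts the job and, per Kevin's proposal,
    # names the implicitly created filter node so the management tool
    # can address it later.  'filter-node-name' is the option proposed
    # in this thread, not an existing interface.
    import json

    start_backup = {
        "execute": "blockdev-backup",
        "arguments": {
            "device": "drive0",
            "target": "target0",
            "sync": "full",
            "filter-node-name": "backup-filter0",
        },
    }
    print(json.dumps(start_backup))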
Re: [Qemu-devel] Minutes from the "Stuttgart block Gipfele"
Am 23.12.2015 um 09:33 hat Stefan Hajnoczi geschrieben:
> On Fri, Dec 18, 2015 at 02:15:38PM +0100, Markus Armbruster wrote:
> > What should happen when the user asks for a mutation at a place where we
> > have implicit filter(s)?
>
> Please suspend your disbelief for a second:
>
> In principle it's simplest not to have implicit filters. The client
> needs to set up throttling nodes or the backup filter explicitly.
>
> Okay, now it's time to tear this apart:
>
> For backwards compatibility it's necessary to support throttling,
> copy-on-read, backup notifier, etc. It may be possible to tag implicit
> filter nodes so that mutation operations that wouldn't be possible today
> are rejected. The client must use the explicit syntax to do mutations
> on implicit filters. This is easier said than done; I'm not sure it can
> be implemented cleanly.

Yes, backwards compatibility is what complicates things a bit.

> Another problem is that the backup block job and other operations that
> require a single command today could require sequences of low-level
> setup commands to create filter nodes. The QMP client would need to
> first create a write notifier filter and then start the backup block job
> with the write notifier node name. It's clumsy.

I don't think splitting it up into several low-level commands is
necessary. We don't expect the user to set any options for the filter
(and if we did, they would probably have to match the same options for
the block job), so we can keep the BDS creation as a part of starting
the block job.

The important part is that the management tool knows that a filter is
going to be inserted and how to address it. In order to achieve that, we
could simply add a 'filter-node-name' option to the QMP command starting
the job.

Kevin
Re: [Qemu-devel] Minutes from the "Stuttgart block Gipfele"
On Mon, 01/04 13:16, Stefan Hajnoczi wrote:
> On Wed, Dec 23, 2015 at 06:15:20PM +0800, Fam Zheng wrote:
> > On Fri, 12/18 14:15, Markus Armbruster wrote:
> > In that theory, all other block job types, mirror/stream/commit, fit into a
> > "pull" model, which follows a specified dirty bitmap and copies data from a
> > specified src BDS. In this pull model,
> >
> > mirror (device=d0 target=d1) becomes a pull filter:
> >
> >     BB[d0]          BB[d1]
> >     |               |
> >     throttle        [pull,src=d0]
> >     |               |
> >     detect-zero     detect-zero
> >     |               |
> >     copy-on-read    copy-on-read
> >     |               |
> >     BDS             BDS
> >
> > Note: the pull filter reuses most of the block/mirror.c code, except that
> > s->dirty_bitmap will be initialized depending on the block job type. In the
> > case of mirror, it is trivially the same as now.
>
> I don't understand the pull filter. Is there also a mirror block job
> coroutine?
>
> Does anything perform I/O to BB[d1]?

Yes, the filter will have a mirror block job coroutine, and it writes to
the BDS behind BB[d1]. This is conceptually different from the "block
jobs have their own BBs" design.

> If nothing is writing to/reading from BB[d1], then I don't understand
> the purpose of the pull filter.
>
> Stefan
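As a toy model of that answer (plain Python, no QEMU APIs, all names
invented): a filter object that hosts its own long-running copy loop and
is the only thing writing to the target, so the I/O to the target comes
from the filter itself rather than from a separate job with its own BB.

    import queue
    import threading

    class PullFilter:
        """Toy pull filter: hosts its own copy loop (standing in for the
        mirror block job coroutine) and is the sole writer to the target."""

        def __init__(self, source, target):
            self.source = source           # toy BDS: offset -> data
            self.target = target
            self.dirty = queue.Queue()     # stand-in for s->dirty_bitmap
            for off in source:             # mirror: everything starts dirty
                self.dirty.put(off)
            threading.Thread(target=self._copy_loop, daemon=True).start()

        def write(self, off, data):
            """Guest write path: update the source, mark the region dirty."""
            self.source[off] = data
            self.dirty.put(off)

        def _copy_loop(self):
            """Background copy: drain dirty regions into the target."""
            while True:
                off = self.dirty.get()
                self.target[off] = self.source[off]
                self.dirty.task_done()

    src, dst = {0: b"a", 1: b"b"}, {}
    f = PullFilter(src, dst)
    f.write(1, b"c")        # guest write while the "job" runs
    f.dirty.join()          # wait for convergence
    assert dst == {0: b"a", 1: b"c"}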
Re: [Qemu-devel] Minutes from the "Stuttgart block Gipfele"
On Wed, Dec 23, 2015 at 06:15:20PM +0800, Fam Zheng wrote:
> On Fri, 12/18 14:15, Markus Armbruster wrote:
> In that theory, all other block job types, mirror/stream/commit, fit into a
> "pull" model, which follows a specified dirty bitmap and copies data from a
> specified src BDS. In this pull model,
>
> mirror (device=d0 target=d1) becomes a pull filter:
>
>     BB[d0]          BB[d1]
>     |               |
>     throttle        [pull,src=d0]
>     |               |
>     detect-zero     detect-zero
>     |               |
>     copy-on-read    copy-on-read
>     |               |
>     BDS             BDS
>
> Note: the pull filter reuses most of the block/mirror.c code, except that
> s->dirty_bitmap will be initialized depending on the block job type. In the
> case of mirror, it is trivially the same as now.

I don't understand the pull filter. Is there also a mirror block job
coroutine?

Does anything perform I/O to BB[d1]?

If nothing is writing to/reading from BB[d1], then I don't understand
the purpose of the pull filter.

Stefan
Re: [Qemu-devel] Minutes from the "Stuttgart block Gipfele"
On Fri, 12/18 14:15, Markus Armbruster wrote:
> First, let's examine what such a chain could look like. If we read the
> current code correctly, it behaves as if we had a chain
>
>     BB
>     |
>     throttle
>     |
>     detect-zero
>     |
>     copy-on-read
>     |
>     BDS
>
> Except for the backup job, which behaves as if we had
>
>        backup job
>       /
>     notifier
>     |
>     detect-zero
>     |
>     BDS

Just to brainstorm block jobs in the dynamically reconfigured node graph
(not sure if this is useful):

Nothing stops us from viewing backup as a self-contained filter,

    [backup]
    |
    detect-zero
    |
    BDS

where its .bdrv_co_writev copies out the old data, and at instantiation
time it also creates a long-running coroutine (backup_run).

In that theory, all other block job types, mirror/stream/commit, fit into
a "pull" model, which follows a specified dirty bitmap and copies data
from a specified src BDS. In this pull model,

mirror (device=d0 target=d1) becomes a pull filter:

    BB[d0]          BB[d1]
    |               |
    throttle        [pull,src=d0]
    |               |
    detect-zero     detect-zero
    |               |
    copy-on-read    copy-on-read
    |               |
    BDS             BDS

Note: the pull filter reuses most of the block/mirror.c code, except that
s->dirty_bitmap will be initialized depending on the block job type. In
the case of mirror, it is trivially the same as now.

stream (device=d0 base=base0) becomes a pull filter:

    BB[d0]
    |
    [pull,src=base0]
    |
    detect-zero
    |
    copy-on-read
    |
    BDS
    |
    BDS[base0]

Note: s->dirty_bitmap will be initialized with the blocks which should be
copied by block-stream.

Similarly, active commit (device=d0 base=base0) becomes a pull filter:

    BB[d0]
    |
    detect-zero
    |
    copy-on-read
    |
    BDS
    |
    [pull,src=d0]
    |
    BDS[base0]

and commit (device=d0 top=base1 base=base0) becomes a pull filter:

    BB[d0]
    |
    detect-zero
    |
    copy-on-read
    |
    BDS
    |
    BDS[base1]
    |
    [pull,src=base1]
    |
    BDS[base0]

If this could work, I'm looking forward to a pretty diffstat if we can
unify the coroutine code of all four jobs. :)

Fam
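Restating the four graphs as data, assuming a hypothetical 'pull'
blockdev driver (no such driver or options exist in QEMU; the option
names below are invented for illustration): each job reduces to inserting
the same filter at a different place, with a different source node and a
different dirty bitmap initialization, which is the unification argument.

    # Purely hypothetical: every driver and option name here is invented.
    PULL_CONFIGS = {
        # mirror (device=d0 target=d1): filter tops d1's chain
        "mirror":        {"src": "d0",    "insert-at": "top-of-d1",
                          "bitmap-init": "as-mirror-today"},
        # stream (device=d0 base=base0): filter tops d0's chain
        "stream":        {"src": "base0", "insert-at": "top-of-d0",
                          "bitmap-init": "blocks-to-stream"},
        # active commit (device=d0 base=base0): filter above base0
        "active-commit": {"src": "d0",    "insert-at": "above-base0",
                          "bitmap-init": "allocated-in-top"},
        # commit (device=d0 top=base1 base=base0): filter above base0
        "commit":        {"src": "base1", "insert-at": "above-base0",
                          "bitmap-init": "allocated-in-top"},
    }

    def pull_add_command(job, node_name="pull0"):
        """Build a (hypothetical) blockdev-add for one job type."""
        return {"execute": "blockdev-add",
                "arguments": {"driver": "pull", "node-name": node_name,
                              **PULL_CONFIGS[job]}}

    for job in PULL_CONFIGS:
        print(job, "->", pull_add_command(job))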
Re: [Qemu-devel] Minutes from the "Stuttgart block Gipfele"
On Fri, Dec 18, 2015 at 02:15:38PM +0100, Markus Armbruster wrote:
> What should happen when the user asks for a mutation at a place where we
> have implicit filter(s)?

Please suspend your disbelief for a second:

In principle it's simplest not to have implicit filters. The client
needs to set up throttling nodes or the backup filter explicitly.

Okay, now it's time to tear this apart:

For backwards compatibility it's necessary to support throttling,
copy-on-read, backup notifier, etc. It may be possible to tag implicit
filter nodes so that mutation operations that wouldn't be possible today
are rejected. The client must use the explicit syntax to do mutations
on implicit filters. This is easier said than done; I'm not sure it can
be implemented cleanly.

Another problem is that the backup block job and other operations that
require a single command today could require sequences of low-level
setup commands to create filter nodes. The QMP client would need to
first create a write notifier filter and then start the backup block job
with the write notifier node name. It's clumsy.

Stefan
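To illustrate the clumsiness: a fully explicit setup might require a
sequence like the sketch below. None of these exact commands or options
exist; the 'write-notifier' driver and 'notifier-node' argument are
invented here to show the shape of the multi-command approach.

    import json

    # Hypothetical wire-level sketch of the explicit, low-level approach.
    explicit_backup = [
        # Step 1: create the filter node by hand...
        {"execute": "blockdev-add",
         "arguments": {"driver": "write-notifier",  # invented driver name
                       "node-name": "backup-notifier0",
                       "file": "drive0-bds"}},
        # Step 2: ...then start the job, pointing it at the filter.
        {"execute": "blockdev-backup",
         "arguments": {"device": "drive0",
                       "target": "target0",
                       "sync": "full",
                       "notifier-node": "backup-notifier0"}},  # invented
    ]
    for cmd in explicit_backup:
        print(json.dumps(cmd))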
[Qemu-devel] Minutes from the "Stuttgart block Gipfele"
Kevin, Max and I used an opportunity to meet and discuss block layer
matters. We examined two topics in some depth: BlockBackend, and block
filters and dynamic reconfiguration.

Not nearly enough people to call it a block summit. But the local
dialect is known for its use of diminutives, and "Gipfele" is the
diminutive of "summit" :)

= BlockBackend =

Background: BlockBackend (BB) was split off BlockDriverState (BDS) to
separate the block layer's external interface (BB) from its internal
building block (BDS). Block layer clients such as device models and the
NBD server attach to a BB by BB name. A BB has zero or one BDS (zero
means no medium).

Multiple device models using the same BB is dangerous, so we allow
attaching only one. We don't currently enforce an "only one" restriction
for other clients. This is problematic, because

* Different clients may want to configure the BB in conflicting ways,
  e.g. writeback caching mode (still to be moved from the BDS's
  enable_write_cache to the BB).

* When the BDS graph gets dynamically reconfigured, say when a block
  filter gets spliced in, clients that started out in the same spot may
  need to move differently.

Instead, each client should connect to its own BB.

This leads to the next question: how should this BB be created?

Initially, what is now the BB was mashed into the BDS. In a way, the BB
got created along with the BDS. The current code lets you create a BB
along with a BDS when you need one, or create a new BB for an existing
BDS. The BB has a name, and the BDS may have a node-name.

The obvious low-level building blocks would be "create BB", "connect BB
to a BDS" (we have that as x-blockdev-insert-medium), "disconnect BB from
a BDS" (x-blockdev-remove-medium) and "destroy BB" (x-blockdev-del); see
the sketch at the end of this section. Management applications probably
don't mind having to work at this low level, but for human users, it's
cumbersome. Perhaps the BB should be created along with the client, at
least optionally.

Means to create BBs separately are mostly useful when the BB needs to be
configured by the user: instead of duplicating the BB configuration
within each client, we keep it neatly separate. We're not aware of
user-configurable knobs, though.

Currently, a client is configured to attach to a BB by specifying a BB
name. For instance, a device model has a "drive" property that names a
BB. If we create the BB automatically, we need client configuration to
name a BDS instead, i.e. we need a node-name instead of a BB name.

Of course, we'll have to keep the legacy configuration working. The
"drive" property will have to refer to a BDS, like it did before BBs
were invented. We could:

* Move the BB name back into the BDS.

* Move the BB name into DriveInfo, where the other legacy stuff lives.
  DriveInfo needs to be changed to hang off BDS rather than BB.

Regardless, dynamic reconfiguration may have to move the name / the
DriveInfo to a different BDS.

We're not entirely sure whether automatic creation of the BB is
worthwhile.

Next steps:

* Support multiple BBs sharing the same BDS.

* Restrict BB to only one client of any kind instead of special-casing
  device models.

* Block jobs should go through BB.

* Investigate automatic creation of BB.
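As a sketch, the low-level sequence described above might look like this
on the wire. The "create BB" command is invented, since no dedicated
command for it exists yet, and the argument names of the experimental
x-blockdev-* commands are assumptions.

    import json

    # Hypothetical reconfiguration using the low-level building blocks:
    # create a BB, connect it to an existing BDS, disconnect it again,
    # and finally destroy it.
    low_level_sequence = [
        {"execute": "blockdev-create-bb",         # invented command name
         "arguments": {"name": "bb0"}},
        {"execute": "x-blockdev-insert-medium",   # connect BB to a BDS
         "arguments": {"device": "bb0", "node-name": "node0"}},
        {"execute": "x-blockdev-remove-medium",   # disconnect BB from BDS
         "arguments": {"device": "bb0"}},
        {"execute": "x-blockdev-del",             # destroy the BB
         "arguments": {"id": "bb0"}},
    ]
    for cmd in low_level_sequence:
        print(json.dumps(cmd))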
= Block filters =

We already have a few block filters:

* blkdebug, blkverify, quorum

Encryption should become another one.

Moreover, we have a few things mashed into BDS that should be filters:

* throttle (only at a root, i.e. right below a BB), copy-on-read,
  notifier (for backup block job), detect-zero

Dynamic reconfiguration means altering the BDS graph while it's in use.
Existing mutators:

* snapshot, mirror-complete, commit-complete, x-blockdev-change

Things become interesting when nodes get implicitly inserted into the
graph, e.g.:

* A backup job inserts its notifier filter

* We create an implicit throttle filter to implement legacy throttling
  configuration

And so forth. Nothing of the sort exists just yet.

What should happen when the user asks for a mutation at a place where we
have implicit filter(s)?

First, let's examine what such a chain could look like. If we read the
current code correctly, it behaves as if we had a chain

    BB
    |
    throttle
    |
    detect-zero
    |
    copy-on-read
    |
    BDS

Except for the backup job, which behaves as if we had

       backup job
      /
    notifier
    |
    detect-zero
    |
    BDS

We believe that the following cleaned up filter stack should work:

    BB
    |
    throttle      \
    |              \
    copy-on-read    ) fixed at creation time
    |              /
    detect-zero   /
    |
    |  backup job
    | /
    notifier        ) dynamically inserted by the job
    |
    BDS

Clients (device model, NBD server) connect through a BB on top.

Snapshot cuts in between the BDS and its implicit filters, like this:

    BB