On 03/01/2012 08:36 AM, Paolo Bonzini wrote:
> Il 01/03/2012 16:13, Federico Simoncelli ha scritto:
>> Signed-off-by: Federico Simoncelli <fsimo...@redhat.com>
>> ---
>>  blockdev.c       |    8 ++++++++
>>  qapi-schema.json |   12 ++++++++++++
>>  qmp-commands.hx  |    6 +++++-
>>  3 files changed, 25 insertions(+), 1 deletions(-)
>>

>> @@ -734,6 +734,10 @@ actions array:
>>        - "format": format of new image (json-string, optional)
>>        - "reuse": whether QEMU should look for an existing image file
>>          (json-bool, optional, default false)
>> +      When "type" is "reopen":
>> +      - "device": device name to reopen (json-string)
>> +      - "target": name of destination image file (json-string)
>> +      - "format": format of new image (json-string, optional)
>>  
>>  Example:
>>  
> 
> This keeps the whole "tower" of backing_hds from the old block device,
> right? (including the blkmirror!)  I think that instead you need to
> modify bdrv_append, probably by saving the action kind in
> BlkTransactionStates.

Do we need more flexibility in what is being reopened?

I see several possibilities.  In the examples below, I'm using '*' for
any file that qemu has an open fd for, and [] to show the contents of a
backing file field within that file (which can be relative or absolute).

1. post-copy, oVirt creates the snapshot with a relative backing file
name.  That is, when using blockdev-transaction with the 'reuse' flag
set to true, qemu is merely opening the new file, ignoring the backing
file name present in the file it just opens, and using the current open
fd as the backing file of the just-opened fd.  That would give this setup:

> */store1/base <- */store1/snap1[backed by 'base'] <-\
>  /store2/base <-  /store2/snap1[backed by 'base']    \- */store2/snap2[backed 
> by 'snap1'] <- qemu

and the action of the reopen will actually follow the relative backing
file name.  That is, 'snap2' did not technically have to be reopened, so
much as 'snap1'; but the act of reopening the entire 'snap2' chain
causes the swap so that all open files are now from /store2.

>  /store1/base <-  /store1/snap1[backed by 'base']
> */store2/base <- */store2/snap1[backed by 'base'] <- */store2/snap2[backed by 
> 'snap1'] <- qemu

2. post-copy, letting qemu create the snapshot with absolute backing
file name.  We are starting with:

> */store1/base <- */store1/snap1[backed by '/store1/base'] <-\
>  /store2/base <-  /store2/snap1[backed by '/store2/base']    \- 
> */store2/snap2[backed by '/store1/snap1'] <- qemu

and here, we actually need to rewrite the snap2 file to point to a
different backing file.  Again, we don't technically need to reopen
'snap2', so much as we need to reopen a different 'snap1' as well as
rewrite the backing file of snap2, to end with:

>  /store1/base <-  /store1/snap1[backed by '/store1/base']
> */store2/base <- */store2/snap1[backed by '/store2/base'] <- 
> */store2/snap2[backed by '/store2/snap1'] <- qemu

3. mirrored migration, where oVirt pre-creates the files with relative
backing names.  qemu doesn't open the full chain of the mirror file, so
we are starting with:

> */store1/base <- */store1/snap1[backed by 'base'] <-+-- */store1/snap2[backed 
> by 'snap1'] <- qemu r/w
>  /store2/base <-  /store2/snap1[backed by 'base']    \- */store2/snap2[backed 
> by 'snap1'] <- qemu write only

and here, we really want the reopen to drop the side of the mirror that
we are no longer using, as well as to follow the full chain of the side
that we are swapping to.  Again, it is not technically necessary to
reopen store2/snap2 (which is already open), so much as to open the
backing chain, to end with:

>  /store1/base <-  /store1/snap1[backed by 'base'] <-  /store1/snap2[backed by 
> 'snap1']
> */store2/base <- */store2/snap1[backed by 'base'] <- */store2/snap2[backed by 
> 'snap1'] <- qemu r/w

4. mirrored migration, where qemu creates the snapshot and mirror with
absolute backing names.  We are starting with:

> */store1/base <- */store1/snap1[backed by '/store1/base'] <-+-- 
> */store1/snap2[backed by '/store1/snap1'] <- qemu r/w
>  /store2/base <-  /store2/snap1[backed by '/store2/base']    \- 
> */store2/snap2[backed by '/store1/snap1'] <- qemu write only

and just as in scenario 2, we want to rewrite the backing file stored in
store2/snap2 as part of the reopen, to end with:

>  /store1/base <-  /store1/snap1[backed by '/store1/base'] <-  
> /store1/snap2[backed by '/store1/snap1']
> */store2/base <- */store2/snap1[backed by '/store2/base'] <- 
> */store2/snap2[backed by '/store2/snap1'] <- qemu r/w


Notice that in all 4 cases, the right file in current use by qemu was
already open; what this operation is really doing is reopening an
updated backing chain, as well as possibly discarding a mirror and/or
rewriting the backing file of the top level file as part of opening the
new backing chain.  I think we may need an optional argument that says
that when reopening a file, we also want to rewrite its backing chain to
the specified file name and type, as in:

{ 'type': 'BlockdevReopen',
  'data': { 'device': 'str', 'target': 'str', '*format': 'str',
            '*backing': 'str', '*backing-format', 'str' } }

Swap 'device' to use the reopened 'target' (with given 'format') as its
current file, as well as reopening the entire backing file chain of
'target'.  This discards any mirror files that were previously in use.
Optionally, if 'backing' is given, rewrite 'target' to use 'backing'
rather than using the backing file currently recorded in target.

> But you can even keep from your first patch the drive-reopen command and
> not make it atomic, that shouldn't be a problem.

I'm not sure whether it makes sense for a separate drive-reopen or
whether to just add this to blockdev-transaction (or even both); I can
make libvirt use whichever color bikeshed we pick.  There's definitely a
transaction aspect here, though - we are reopening _all_ files along the
backing chain, so we must be able to roll back without rewriting any
backing file or altering the blockdev stack of in-use files if any of
the opens fail.

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to