Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS

2016-06-09 Thread Nir Soffer
On Thu, Jun 9, 2016 at 11:58 AM, Kevin Wolf  wrote:
> On 08.06.2016 at 17:39, Nir Soffer wrote:
>> On Wed, Jun 8, 2016 at 12:32 PM, Kevin Wolf  wrote:
>> > On 06.06.2016 at 16:42, Max Reitz wrote:
>> >> Currently, we are trying to move the backing BDS from the source to the
>> >> target in bdrv_replace_in_backing_chain() which is called from
>> >> mirror_exit(). However, mirror_complete() already tries to open the
>> >> target's backing chain with a call to bdrv_open_backing_file().
>> >>
>> >> First, we should only set the target's backing BDS once. Second, the
>> >> mirroring block job has a better idea of what to set it to than the
>> >> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>> >> conditions on when to move the backing BDS from source to target are not
>> >> really correct).
>> >>
>> >> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>> >> leave it to mirror_complete().
>> >>
>> >> However, mirror_complete() in turn pursues a questionable strategy by
>> >> employing bdrv_open_backing_file(): On the one hand, because this may
>> >> open the wrong backing file with drive-mirror in "existing" mode, or
>> >> because it will not override a possibly wrong backing file in the
>> >> blockdev-mirror case.
>> >>
>> >> On the other hand, we want to reuse the existing backing chain of the
>> >> source instead of opening everything anew, because the latter results in
>> >> having multiple BDSs for a single physical file and thus potentially
>> >> concurrent access which we should try to avoid.
>> >
>> > Careful, this "wrong" backing file might actually be intended!
>> >
>> > Consider a case where you want to move an image with its whole backing
>> > chain to different storage. In that case, you would copy all of the
>> > backing files (cp is good enough, they are read-only), create the
>> > destination image which already points at the copied backing chain, and
>> > then mirror in "existing" mode.
>> >
>> > The intention is obviously that after the job completion the new backing
>> > chain is used and not the old one.
>> >
>> > I know that such cases were discussed when mirroring was introduced, I'm
>> > not sure whether it's actually used. We need some input there:
>> >
>> > Eric, can you tell us whether libvirt makes use of such a setup?
>> >
>> > Nir, I'm not sure who is the right person in oVirt these days, but do
>> > you either know yourself whether oVirt requires this to work, or do you
>> > know who else would know?
>>
>> I'm the right person, thanks for keeping me in the loop.
>>
>> What you describe is how we migrate a disk from one storage to another:
>>
>> 1. Create a vm snapshot
>> 2. Create a volume on the destination storage for the snapshot
>> 3. Start mirroring from the source snapshot to the destination snapshot
>> using libvirt virDomainBlockCopy:
>> https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockCopy
>
> With VIR_DOMAIN_BLOCK_COPY_SHALLOW set, right? (That is, sync=top in QMP
> terms.)

Yes, actually we use:

VIR_DOMAIN_BLOCK_COPY_SHALLOW | VIR_DOMAIN_BLOCK_COPY_REUSE_EXT

>> 4. Copy the rest of the chain from source to destination using qemu-img convert
>> 5. Pivot to the new chain using libvirt virDomainBlockJobAbort
>> https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockJobAbort
>> 6. Remove the old chain
>>
>> Source and target can be files or block devices, and we also plan to support
>> rbd and gluster volumes as targets, maybe also as sources.
>
> Thanks, Nir, we should then do our best not to break it.
>
> Max, maybe we can add a qemu-iotests case that does the exact same thing
> as oVirt does?
>
> Kevin



Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS

2016-06-08 Thread Nir Soffer
On Wed, Jun 8, 2016 at 12:32 PM, Kevin Wolf  wrote:
> On 06.06.2016 at 16:42, Max Reitz wrote:
>> Currently, we are trying to move the backing BDS from the source to the
>> target in bdrv_replace_in_backing_chain() which is called from
>> mirror_exit(). However, mirror_complete() already tries to open the
>> target's backing chain with a call to bdrv_open_backing_file().
>>
>> First, we should only set the target's backing BDS once. Second, the
>> mirroring block job has a better idea of what to set it to than the
>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>> conditions on when to move the backing BDS from source to target are not
>> really correct).
>>
>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>> leave it to mirror_complete().
>>
>> However, mirror_complete() in turn pursues a questionable strategy by
>> employing bdrv_open_backing_file(): On the one hand, because this may
>> open the wrong backing file with drive-mirror in "existing" mode, or
>> because it will not override a possibly wrong backing file in the
>> blockdev-mirror case.
>>
>> On the other hand, we want to reuse the existing backing chain of the
>> source instead of opening everything anew, because the latter results in
>> having multiple BDSs for a single physical file and thus potentially
>> concurrent access which we should try to avoid.
>
> Careful, this "wrong" backing file might actually be intended!
>
> Consider a case where you want to move an image with its whole backing
> chain to different storage. In that case, you would copy all of the
> backing files (cp is good enough, they are read-only), create the
> destination image which already points at the copied backing chain, and
> then mirror in "existing" mode.
>
> The intention is obviously that after the job completion the new backing
> chain is used and not the old one.
>
> I know that such cases were discussed when mirroring was introduced, I'm
> not sure whether it's actually used. We need some input there:
>
> Eric, can you tell us whether libvirt makes use of such a setup?
>
> Nir, I'm not sure who is the right person in oVirt these days, but do
> you either know yourself whether oVirt requires this to work, or do you
> know who else would know?

I'm the right person, thanks for keeping me in the loop.

What you describe is how we migrate a disk from one storage to another:

1. Create a vm snapshot
2. Create a volume on the destination storage for the snapshot
3. Start mirroring from the source snapshot to the destination snapshot
using libvirt virDomainBlockCopy:
https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockCopy
4. Copy the rest of the chain from source to destination using qemu-img convert
5. Pivot to the new chain using libvirt virDomainBlockJobAbort
https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockJobAbort
6. Remove the old chain

Source and target can be files or block devices, and we also plan to support
rbd and gluster volumes as targets, maybe also as sources.
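
To make the steps above concrete, here is a minimal, self-contained C sketch
of the flow as a toy model. It is not oVirt, libvirt or QEMU code: the Image
and Disk types and the pivot(), chain_is_on() and migrate_demo() helpers are
hypothetical names invented for illustration, and the mirroring and
qemu-img convert steps are represented only by the state they leave behind.
It shows why the pivot has to end up on the destination's own backing chain
rather than on the source's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy image: a name, the storage it lives on, and its backing link. */
typedef struct Image {
    const char *name;
    const char *storage;
    struct Image *backing;   /* NULL for the base image */
} Image;

/* A VM disk is modelled only by the top image it currently uses. */
typedef struct {
    Image *active;
} Disk;

/* Step 5 (hypothetical helper): switch the disk to the mirror target. */
void pivot(Disk *disk, Image *target)
{
    assert(disk != NULL && target != NULL);
    disk->active = target;
}

/* Returns 1 if every image reachable from top lives on the given storage. */
int chain_is_on(const Image *top, const char *storage)
{
    for (const Image *img = top; img != NULL; img = img->backing) {
        if (strcmp(img->storage, storage) != 0) {
            return 0;
        }
    }
    return 1;
}

/*
 * Walks through steps 1-5 on a two-image chain. If reuse_source_backing is
 * non-zero, the target's backing link is pointed at the source's base image
 * instead of the copy on the destination storage.
 * Returns 1 if the chain in use after the pivot is entirely on "dst".
 */
int migrate_demo(int reuse_source_backing)
{
    Image src_base = { "base", "src", NULL };
    Image src_top  = { "snap", "src", &src_base };  /* step 1: snapshot */
    Disk disk = { &src_top };

    Image dst_base = { "base", "dst", NULL };       /* step 4: copied base */
    Image dst_top  = { "snap", "dst", &dst_base };  /* step 2: new volume */

    /* Step 3 (shallow mirror of the top layer) copies data only and is not
     * modelled here. */
    if (reuse_source_backing) {
        dst_top.backing = &src_base;
    }

    pivot(&disk, &dst_top);                         /* step 5 */
    return chain_is_on(disk.active, "dst");
}
```

In this model migrate_demo(0) returns 1, because the whole active chain is on
the destination, while migrate_demo(1) returns 0: the active chain still
reaches into the source storage, so step 6 would no longer be safe.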

Nir

>
>> Thus, instead of invoking bdrv_open_backing_file(), just set the correct
>> backing BDS directly via bdrv_set_backing_hd(). Also, do so only when
>> mirror_complete() is certain to succeed.
>>
>> In contrast to what bdrv_replace_in_backing_chain() did so far, we do
>> not need to drop the source's backing file.
>>
>> Signed-off-by: Max Reitz 
>
> Leaving the actual code review for later when we have decided what
> semantics we even want.
>
> Kevin



Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS

2016-06-08 Thread Max Reitz
On 08.06.2016 13:28, Paolo Bonzini wrote:
> 
> 
> - Original Message -
>> From: "Kevin Wolf" 
>> To: "Max Reitz" 
>> Cc: qemu-block@nongnu.org, qemu-de...@nongnu.org, "Fam Zheng" 
>> , nsof...@redhat.com,
>> ebl...@redhat.com, pbonz...@redhat.com
>> Sent: Wednesday, June 8, 2016 11:32:29 AM
>> Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
>>
>> On 06.06.2016 at 16:42, Max Reitz wrote:
>>> Currently, we are trying to move the backing BDS from the source to the
>>> target in bdrv_replace_in_backing_chain() which is called from
>>> mirror_exit(). However, mirror_complete() already tries to open the
>>> target's backing chain with a call to bdrv_open_backing_file().
>>>
>>> First, we should only set the target's backing BDS once. Second, the
>>> mirroring block job has a better idea of what to set it to than the
>>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>>> conditions on when to move the backing BDS from source to target are not
>>> really correct).
>>>
>>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>>> leave it to mirror_complete().
>>>
>>> However, mirror_complete() in turn pursues a questionable strategy by
>>> employing bdrv_open_backing_file(): On the one hand, because this may
>>> open the wrong backing file with drive-mirror in "existing" mode, or
>>> because it will not override a possibly wrong backing file in the
>>> blockdev-mirror case.
>>
>> Careful, this "wrong" backing file might actually be intended!
>>
>> Consider a case where you want to move an image with its whole backing
>> chain to different storage. In that case, you would copy all of the
>> backing files (cp is good enough, they are read-only), create the
>> destination image which already points at the copied backing chain, and
>> then mirror in "existing" mode.
>>
>> The intention is obviously that after the job completion the new backing
>> chain is used and not the old one.
> 
> Yes, this is the intention and it should not be changed.  In addition
> to what Kevin said, you can use drive-mirror to collapse the image to a
> single file; in this case, QEMU should not be using the backing files of
> the source.

That is an issue that we have right now. If you do drive-mirror in
absolute-paths mode with sync=full, the target will have the backing
chain of the source. This is something that this patch fixes.

In fact, I think if you do drive-mirror in existing mode or
blockdev-mirror and the target image does not have a backing file
(whatever sync mode you have used), the same will happen.
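
For reference, the selection rule that the patch puts into mirror_complete()
can be sketched as a small standalone C model. This is a toy, not QEMU code:
Node, SyncMode, chain_contains(), pick_target_backing() and complete_mirror()
are hypothetical names, and the mapping from sync mode to the chosen backing
node (none: the source itself, top: the source's backing node, full: nothing)
is my assumption about what s->is_none_mode and s->base stand for.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node of the in-memory graph: only the backing link is modelled. */
typedef struct Node {
    const char *name;
    struct Node *backing;
} Node;

typedef enum { SYNC_FULL, SYNC_TOP, SYNC_NONE } SyncMode;

/* Returns 1 if needle is reachable from top by following backing links. */
int chain_contains(const Node *top, const Node *needle)
{
    for (; top != NULL; top = top->backing) {
        if (top == needle) {
            return 1;
        }
    }
    return 0;
}

/* Assumed mapping from sync mode to the backing node for the target. */
Node *pick_target_backing(Node *src, SyncMode mode)
{
    assert(src != NULL);
    switch (mode) {
    case SYNC_NONE:
        return src;            /* target sits on top of the source */
    case SYNC_TOP:
        return src->backing;   /* target shares the source's backing chain */
    case SYNC_FULL:
    default:
        return NULL;           /* target is standalone */
    }
}

/* Models the backing adjustment done when the mirror job completes. */
void complete_mirror(Node *src, Node *target, SyncMode mode)
{
    assert(src != NULL && target != NULL);

    /* Commit-like case: the target is already below the source. */
    if (chain_contains(src->backing, target)) {
        return;
    }

    Node *backing = pick_target_backing(src, mode);
    if (target->backing != backing) {
        target->backing = backing;
    }
}
```

With a source "src" on top of "base", the model gives a target no backing
node for full, "base" for top and "src" for none. It also overrides a backing
node that the target already had, which is the behaviour being questioned in
this thread for "existing" mode.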

Max

> bdrv_open_backing_file() is used because what we want to do is to
> "undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.
> 
> If the contents change under the guest's feet, it's the layers above
> QEMU that have screwed up.
> 
> Paolo
> 






Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS

2016-06-08 Thread Max Reitz
On 08.06.2016 16:40, Max Reitz wrote:
> On 08.06.2016 13:28, Paolo Bonzini wrote:
>>
>>
>> - Original Message -
>>> From: "Kevin Wolf" 
>>> To: "Max Reitz" 
>>> Cc: qemu-block@nongnu.org, qemu-de...@nongnu.org, "Fam Zheng" 
>>> , nsof...@redhat.com,
>>> ebl...@redhat.com, pbonz...@redhat.com
>>> Sent: Wednesday, June 8, 2016 11:32:29 AM
>>> Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
>>>
>>> On 06.06.2016 at 16:42, Max Reitz wrote:
>>>> Currently, we are trying to move the backing BDS from the source to the
>>>> target in bdrv_replace_in_backing_chain() which is called from
>>>> mirror_exit(). However, mirror_complete() already tries to open the
>>>> target's backing chain with a call to bdrv_open_backing_file().
>>>>
>>>> First, we should only set the target's backing BDS once. Second, the
>>>> mirroring block job has a better idea of what to set it to than the
>>>> generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
>>>> conditions on when to move the backing BDS from source to target are not
>>>> really correct).
>>>>
>>>> Therefore, remove that code from bdrv_replace_in_backing_chain() and
>>>> leave it to mirror_complete().
>>>>
>>>> However, mirror_complete() in turn pursues a questionable strategy by
>>>> employing bdrv_open_backing_file(): On the one hand, because this may
>>>> open the wrong backing file with drive-mirror in "existing" mode, or
>>>> because it will not override a possibly wrong backing file in the
>>>> blockdev-mirror case.
>>>
>>> Careful, this "wrong" backing file might actually be intended!
>>>
>>> Consider a case where you want to move an image with its whole backing
>>> chain to different storage. In that case, you would copy all of the
>>> backing files (cp is good enough, they are read-only), create the
>>> destination image which already points at the copied backing chain, and
>>> then mirror in "existing" mode.
>>>
>>> The intention is obviously that after the job completion the new backing
>>> chain is used and not the old one.
>>
>> Yes, this is the intention and it should not be changed.  In addition
>> to what Kevin said, you can use drive-mirror to collapse the image to a
>> single file; in this case, QEMU should not be using the backing files of
>> the source.
> 
> That is an issue that we have right now. If you do drive-mirror in
> absolute-paths mode with sync=full, the target will have the backing
> chain of the source. This is something that this patch fixes.

As a clarification: I mean the backing chain inside QEMU (in the BDS
graph), not the on-disk backing chain, i.e. how the physical image files
link to each other.

Max

> In fact, I think if you do drive-mirror in existing mode or
> blockdev-mirror and the target image does not have a backing file
> (whatever sync mode you have used), the same will happen.
> 
> Max
> 
>> bdrv_open_backing_file() is used because what we want to do is to
>> "undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.
>>
>> If the contents change under the guest's feet, it's the layers above
>> QEMU that have screwed up.
>>
>> Paolo
>>
> 
> 






Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS

2016-06-08 Thread Kevin Wolf
On 08.06.2016 at 13:28, Paolo Bonzini wrote:
> 
> 
> - Original Message -
> > From: "Kevin Wolf" 
> > To: "Max Reitz" 
> > Cc: qemu-block@nongnu.org, qemu-de...@nongnu.org, "Fam Zheng" 
> > , nsof...@redhat.com,
> > ebl...@redhat.com, pbonz...@redhat.com
> > Sent: Wednesday, June 8, 2016 11:32:29 AM
> > Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
> > 
> > On 06.06.2016 at 16:42, Max Reitz wrote:
> > > Currently, we are trying to move the backing BDS from the source to the
> > > target in bdrv_replace_in_backing_chain() which is called from
> > > mirror_exit(). However, mirror_complete() already tries to open the
> > > target's backing chain with a call to bdrv_open_backing_file().
> > > 
> > > First, we should only set the target's backing BDS once. Second, the
> > > mirroring block job has a better idea of what to set it to than the
> > > generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
> > > conditions on when to move the backing BDS from source to target are not
> > > really correct).
> > > 
> > > Therefore, remove that code from bdrv_replace_in_backing_chain() and
> > > leave it to mirror_complete().
> > > 
> > > However, mirror_complete() in turn pursues a questionable strategy by
> > > employing bdrv_open_backing_file(): On the one hand, because this may
> > > open the wrong backing file with drive-mirror in "existing" mode, or
> > > because it will not override a possibly wrong backing file in the
> > > blockdev-mirror case.
> > 
> > Careful, this "wrong" backing file might actually be intended!
> > 
> > Consider a case where you want to move an image with its whole backing
> > chain to different storage. In that case, you would copy all of the
> > backing files (cp is good enough, they are read-only), create the
> > destination image which already points at the copied backing chain, and
> > then mirror in "existing" mode.
> > 
> > The intention is obviously that after the job completion the new backing
> > chain is used and not the old one.
> 
> Yes, this is the intention and it should not be changed.  In addition
> to what Kevin said, you can use drive-mirror to collapse the image to a
> single file; in this case, QEMU should not be using the backing files of
> the source.
> 
> bdrv_open_backing_file() is used because what we want to do is to
> "undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.
> 
> If the contents change under the guest's feet, it's the layers above
> QEMU that have screwed up.

We should probably have test cases for both scenarios. They would make
it obvious that changing this behaviour is not okay. Actually, I'm
surprised that our existing cases don't seem to cover this.

Kevin



Re: [Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS

2016-06-08 Thread Paolo Bonzini


- Original Message -
> From: "Kevin Wolf" 
> To: "Max Reitz" 
> Cc: qemu-block@nongnu.org, qemu-de...@nongnu.org, "Fam Zheng" 
> , nsof...@redhat.com,
> ebl...@redhat.com, pbonz...@redhat.com
> Sent: Wednesday, June 8, 2016 11:32:29 AM
> Subject: Re: [PATCH v2 2/3] block/mirror: Fix target backing BDS
> 
> On 06.06.2016 at 16:42, Max Reitz wrote:
> > Currently, we are trying to move the backing BDS from the source to the
> > target in bdrv_replace_in_backing_chain() which is called from
> > mirror_exit(). However, mirror_complete() already tries to open the
> > target's backing chain with a call to bdrv_open_backing_file().
> > 
> > First, we should only set the target's backing BDS once. Second, the
> > mirroring block job has a better idea of what to set it to than the
> > generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
> > conditions on when to move the backing BDS from source to target are not
> > really correct).
> > 
> > Therefore, remove that code from bdrv_replace_in_backing_chain() and
> > leave it to mirror_complete().
> > 
> > However, mirror_complete() in turn pursues a questionable strategy by
> > employing bdrv_open_backing_file(): On the one hand, because this may
> > open the wrong backing file with drive-mirror in "existing" mode, or
> > because it will not override a possibly wrong backing file in the
> > blockdev-mirror case.
> 
> Careful, this "wrong" backing file might actually be intended!
> 
> Consider a case where you want to move an image with its whole backing
> chain to different storage. In that case, you would copy all of the
> backing files (cp is good enough, they are read-only), create the
> destination image which already points at the copied backing chain, and
> then mirror in "existing" mode.
> 
> The intention is obviously that after the job completion the new backing
> chain is used and not the old one.

Yes, this is the intention and it should not be changed.  In addition
to what Kevin said, you can use drive-mirror to collapse the image to a
single file; in this case, QEMU should not be using the backing files of
the source.

bdrv_open_backing_file() is used because what we want to do is to
"undo" the BDRV_O_NO_BACKING flag used by qmp_drive_mirror.

If the contents change under the guest's feet, it's the layers above
QEMU that have screwed up.

Paolo



[Qemu-block] [PATCH v2 2/3] block/mirror: Fix target backing BDS

2016-06-06 Thread Max Reitz
Currently, we are trying to move the backing BDS from the source to the
target in bdrv_replace_in_backing_chain() which is called from
mirror_exit(). However, mirror_complete() already tries to open the
target's backing chain with a call to bdrv_open_backing_file().

First, we should only set the target's backing BDS once. Second, the
mirroring block job has a better idea of what to set it to than the
generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
conditions on when to move the backing BDS from source to target are not
really correct).

Therefore, remove that code from bdrv_replace_in_backing_chain() and
leave it to mirror_complete().

However, mirror_complete() in turn pursues a questionable strategy by
employing bdrv_open_backing_file(): On the one hand, because this may
open the wrong backing file with drive-mirror in "existing" mode, or
because it will not override a possibly wrong backing file in the
blockdev-mirror case.

On the other hand, we want to reuse the existing backing chain of the
source instead of opening everything anew, because the latter results in
having multiple BDSs for a single physical file and thus potentially
concurrent access which we should try to avoid.

Thus, instead of invoking bdrv_open_backing_file(), just set the correct
backing BDS directly via bdrv_set_backing_hd(). Also, do so only when
mirror_complete() is certain to succeed.

In contrast to what bdrv_replace_in_backing_chain() did so far, we do
not need to drop the source's backing file.

Signed-off-by: Max Reitz 
---
 block.c        |  8 --------
 block/mirror.c | 21 +++++++++++++--------
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index 16463aa..792f5dd 100644
--- a/block.c
+++ b/block.c
@@ -2288,14 +2288,6 @@ void bdrv_replace_in_backing_chain(BlockDriverState *old, BlockDriverState *new)
 
     change_parent_backing_link(old, new);
 
-    /* Change backing files if a previously independent node is added to the
-     * chain. For active commit, we replace top by its own (indirect) backing
-     * file and don't do anything here so we don't build a loop. */
-    if (new->backing == NULL && !bdrv_chain_contains(backing_bs(old), new)) {
-        bdrv_set_backing_hd(new, backing_bs(old));
-        bdrv_set_backing_hd(old, NULL);
-    }
-
     bdrv_unref(old);
 }
 
diff --git a/block/mirror.c b/block/mirror.c
index 80fd3c7..217475b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -742,15 +742,11 @@ static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
 static void mirror_complete(BlockJob *job, Error **errp)
 {
     MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
-    Error *local_err = NULL;
-    int ret;
+    BlockDriverState *src, *target;
+
+    src = blk_bs(job->blk);
+    target = blk_bs(s->target);
 
-    ret = bdrv_open_backing_file(blk_bs(s->target), NULL, "backing",
-                                 &local_err);
-    if (ret < 0) {
-        error_propagate(errp, local_err);
-        return;
-    }
     if (!s->synced) {
         error_setg(errp, QERR_BLOCK_JOB_NOT_READY, job->id);
         return;
@@ -777,6 +773,15 @@ static void mirror_complete(BlockJob *job, Error **errp)
         aio_context_release(replace_aio_context);
     }
 
+    /* Now we need to adjust the target's backing BDS. This is not necessary
+     * when having performed a commit operation. */
+    if (!bdrv_chain_contains(backing_bs(src), target)) {
+        BlockDriverState *backing = s->is_none_mode ? src : s->base;
+        if (backing_bs(target) != backing) {
+            bdrv_set_backing_hd(target, backing);
+        }
+    }
+
     s->should_complete = true;
     block_job_enter(&s->common);
 }
-- 
2.8.3