Re: [Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers

2019-09-09 Thread Max Reitz
On 03.09.19 10:06, Vladimir Sementsov-Ogievskiy wrote:
> 02.09.2019 19:34, Max Reitz wrote:

[...]

>> I’m not saying that we need to abandon having BBs right now, but I think
>> there are a couple of cases which show why I say it’s uglier than using
>> BdrvChildren instead.
>>
> 
> OK. I'd prefer to move block-copy to BdrvChildren as follow-up, if you don't 
> mind.

Yep, sure, that’s A-OK with me.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers

2019-09-03 Thread Vladimir Sementsov-Ogievskiy
02.09.2019 19:34, Max Reitz wrote:
> On 29.08.19 16:55, Vladimir Sementsov-Ogievskiy wrote:
>> 28.08.2019 22:50, Max Reitz wrote:
>>> On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
 Drop write notifiers and use filter node instead.

 = Changes =

 1. add filter-node-name argument for backup qmp api. We have to do it
 in this commit, as 257 needs to be fixed.
>>>
>>> I feel a bit bad about it not being an implicit node.  But I know you
>>> don’t like them, they’re probably just broken altogether and because
>>> libvirt doesn’t use backup (yet), probably nobody cares.
>>>
 2. there no move write notifiers here, so is_write_notifier parameter
>>>
>>> s/there/there are/, I suppose?
>>>
 is dropped from block-copy paths.

 3. Intersecting requests handling changed, now synchronization between
 backup-top, backup and guest writes are all done in block/block-copy.c
 and works as follows:

 On copy operation, we work only with dirty areas. If bits are dirty it
 means that there are no requests intersecting with this area. We clear
 dirty bits and take bdrv range lock (bdrv_co_try_lock) on this area to
 prevent further operations from interaction with guest (only with
 guest, as neither backup nor backup-top will touch non-dirty area). If
 copy-operation failed we set dirty bits back together with releasing
 the lock.

 The actual difference with old scheme is that on guest writes we
 don't lock the whole region but only dirty-parts, and to be more
 precise: only dirty-part we are currently operate on. In old scheme
 guest write to non-dirty area (which may be safely ignored by backup)
 may wait for intersecting request, touching some other area which is
 dirty.

 4. To sync with in-flight requests at job finish we now have drained
 removing of the filter, we don't need rw-lock.

 = Notes =

 Note the consequence of three objects appearing: backup-top, backup job
 and block-copy-state:

 1. We want to insert backup-top before job creation, to behave similar
 with mirror and commit, where job is started upon filter.

 2. We also have to create block-copy-state after filter injection, as
 we don't want it's source child be replaced by fitler. Instead we want
>>>
>>> s/it's/its/, s/filter/filter/ (although “fitler” does have an amusing
>>> ring to it)
>>>
 to keep BCS.source to be real source node, as we want to use
 bdrv_co_try_lock in CBW operations and it can't be used on filter, as
 on filter we already have in-flight (write) request from upper layer.
>>>
>>> Reasonable, even more so as I suppose BCS.source should eventually be a
>>> BdrvChild of backup-top.
>>>
>>> What looks wrong is that the sync_bitmap is created on the source (“bs”
>>> in backup_job_create()), but backup_cleanup_sync_bitmap() still assumes
>>> it is on blk_bs(job->common.blk) (which is no longer true).
>>>
 So, we firstly create inject backup-top, then create job and BCS. BCS
 is the latest just to not create extra variable for it. Finally we set
 bcs for backup-top filter.

 = Iotest changes =

 56: op-blocker doesn't shot now, as we set it on source, but then check
>>>
>>> s/shot/show/?
>>>
 on filter, when trying to start second backup, so error caught in
 test_dismiss_collision is changed. It's OK anyway, as this test-case
 seems to test that after some collision we can dismiss first job and
 successfully start the second one. So, the change is that collision is
 changed from op-blocker to file-posix locks.
>>>
>>> Well, but the op blocker belongs to the source, which should be covered
>>> by internal permissions.  The fact that it now shows up as a file-posix
>>> error thus shows that the conflict also moves from the source to the
>>> target.  It’s OK because we actually don’t have a conflict on the source.
>>>
>>> But I wonder whether it would be better for test_dismiss_collision() to
>>> do a blockdev-backup instead where we can see the collision on the target.
>>>
>>> Hm.  On second thought, why do we even get a conflict on the target?
>>> block-copy does share the WRITE permission for it...
>>
>> Not sure, but assume that this is because in file-posix.c in raw_co_create
>> we do want RESIZE perm.
>>
>> I can instead move this test to use specified job-id, to move the collision
>> to "job-id already exists" error. Is it better?
> 
> It’s probably the only thing that actually makes sense.
> 
>> I'm afraid that posix locks will not work if disable them in config.
> 
> Yes (or if the environment doesn’t support them); which is why I
> suggested using blockdev-backup.  (But that would have the same problem
> of “Where does the permission conflict even come from and is it
> inevitable?”)
> 
> [...]
> 
 diff --git a/block/block-copy.c b/block/block-copy.c
 index 6828c46ba0..f3102ec3ff 100644
 --- 

Re: [Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers

2019-09-02 Thread Max Reitz
On 29.08.19 16:55, Vladimir Sementsov-Ogievskiy wrote:
> 28.08.2019 22:50, Max Reitz wrote:
>> On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
>>> Drop write notifiers and use filter node instead.
>>>
>>> = Changes =
>>>
>>> 1. add filter-node-name argument for backup qmp api. We have to do it
>>> in this commit, as 257 needs to be fixed.
>>
>> I feel a bit bad about it not being an implicit node.  But I know you
>> don’t like them, they’re probably just broken altogether and because
>> libvirt doesn’t use backup (yet), probably nobody cares.
>>
>>> 2. there no move write notifiers here, so is_write_notifier parameter
>>
>> s/there/there are/, I suppose?
>>
>>> is dropped from block-copy paths.
>>>
>>> 3. Intersecting requests handling changed, now synchronization between
>>> backup-top, backup and guest writes are all done in block/block-copy.c
>>> and works as follows:
>>>
>>> On copy operation, we work only with dirty areas. If bits are dirty it
>>> means that there are no requests intersecting with this area. We clear
>>> dirty bits and take bdrv range lock (bdrv_co_try_lock) on this area to
>>> prevent further operations from interaction with guest (only with
>>> guest, as neither backup nor backup-top will touch non-dirty area). If
>>> copy-operation failed we set dirty bits back together with releasing
>>> the lock.
>>>
>>> The actual difference with old scheme is that on guest writes we
>>> don't lock the whole region but only dirty-parts, and to be more
>>> precise: only dirty-part we are currently operate on. In old scheme
>>> guest write to non-dirty area (which may be safely ignored by backup)
>>> may wait for intersecting request, touching some other area which is
>>> dirty.
>>>
>>> 4. To sync with in-flight requests at job finish we now have drained
>>> removing of the filter, we don't need rw-lock.
>>>
>>> = Notes =
>>>
>>> Note the consequence of three objects appearing: backup-top, backup job
>>> and block-copy-state:
>>>
>>> 1. We want to insert backup-top before job creation, to behave similar
>>> with mirror and commit, where job is started upon filter.
>>>
>>> 2. We also have to create block-copy-state after filter injection, as
>>> we don't want it's source child be replaced by fitler. Instead we want
>>
>> s/it's/its/, s/filter/filter/ (although “fitler” does have an amusing
>> ring to it)
>>
>>> to keep BCS.source to be real source node, as we want to use
>>> bdrv_co_try_lock in CBW operations and it can't be used on filter, as
>>> on filter we already have in-flight (write) request from upper layer.
>>
>> Reasonable, even more so as I suppose BCS.source should eventually be a
>> BdrvChild of backup-top.
>>
>> What looks wrong is that the sync_bitmap is created on the source (“bs”
>> in backup_job_create()), but backup_cleanup_sync_bitmap() still assumes
>> it is on blk_bs(job->common.blk) (which is no longer true).
>>
>>> So, we firstly create inject backup-top, then create job and BCS. BCS
>>> is the latest just to not create extra variable for it. Finally we set
>>> bcs for backup-top filter.
>>>
>>> = Iotest changes =
>>>
>>> 56: op-blocker doesn't shot now, as we set it on source, but then check
>>
>> s/shot/show/?
>>
>>> on filter, when trying to start second backup, so error caught in
>>> test_dismiss_collision is changed. It's OK anyway, as this test-case
>>> seems to test that after some collision we can dismiss first job and
>>> successfully start the second one. So, the change is that collision is
>>> changed from op-blocker to file-posix locks.
>>
>> Well, but the op blocker belongs to the source, which should be covered
>> by internal permissions.  The fact that it now shows up as a file-posix
>> error thus shows that the conflict also moves from the source to the
>> target.  It’s OK because we actually don’t have a conflict on the source.
>>
>> But I wonder whether it would be better for test_dismiss_collision() to
>> do a blockdev-backup instead where we can see the collision on the target.
>>
>> Hm.  On second thought, why do we even get a conflict on the target?
>> block-copy does share the WRITE permission for it...
> 
> Not sure, but assume that this is because in file-posix.c in raw_co_create
> we do want RESIZE perm.
> 
> I can instead move this test to use specified job-id, to move the collision
> to "job-id already exists" error. Is it better?

It’s probably the only thing that actually makes sense.

> I'm afraid that posix locks will not work if disable them in config.

Yes (or if the environment doesn’t support them); which is why I
suggested using blockdev-backup.  (But that would have the same problem
of “Where does the permission conflict even come from and is it
inevitable?”)

[...]

>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 6828c46ba0..f3102ec3ff 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>> @@ -98,27 +98,32 @@ fail:
>>>* error occurred, return a negative error number
>>>*/
>>>   static int 

Re: [Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers

2019-08-29 Thread Vladimir Sementsov-Ogievskiy
28.08.2019 22:50, Max Reitz wrote:
> On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
>> Drop write notifiers and use filter node instead.
>>
>> = Changes =
>>
>> 1. add filter-node-name argument for backup qmp api. We have to do it
>> in this commit, as 257 needs to be fixed.
> 
> I feel a bit bad about it not being an implicit node.  But I know you
> don’t like them, they’re probably just broken altogether and because
> libvirt doesn’t use backup (yet), probably nobody cares.
> 
>> 2. there no move write notifiers here, so is_write_notifier parameter
> 
> s/there/there are/, I suppose?
> 
>> is dropped from block-copy paths.
>>
>> 3. Intersecting requests handling changed, now synchronization between
>> backup-top, backup and guest writes are all done in block/block-copy.c
>> and works as follows:
>>
>> On copy operation, we work only with dirty areas. If bits are dirty it
>> means that there are no requests intersecting with this area. We clear
>> dirty bits and take bdrv range lock (bdrv_co_try_lock) on this area to
>> prevent further operations from interaction with guest (only with
>> guest, as neither backup nor backup-top will touch non-dirty area). If
>> copy-operation failed we set dirty bits back together with releasing
>> the lock.
>>
>> The actual difference with old scheme is that on guest writes we
>> don't lock the whole region but only dirty-parts, and to be more
>> precise: only dirty-part we are currently operate on. In old scheme
>> guest write to non-dirty area (which may be safely ignored by backup)
>> may wait for intersecting request, touching some other area which is
>> dirty.
>>
>> 4. To sync with in-flight requests at job finish we now have drained
>> removing of the filter, we don't need rw-lock.
>>
>> = Notes =
>>
>> Note the consequence of three objects appearing: backup-top, backup job
>> and block-copy-state:
>>
>> 1. We want to insert backup-top before job creation, to behave similar
>> with mirror and commit, where job is started upon filter.
>>
>> 2. We also have to create block-copy-state after filter injection, as
>> we don't want it's source child be replaced by fitler. Instead we want
> 
> s/it's/its/, s/filter/filter/ (although “fitler” does have an amusing
> ring to it)
> 
>> to keep BCS.source to be real source node, as we want to use
>> bdrv_co_try_lock in CBW operations and it can't be used on filter, as
>> on filter we already have in-flight (write) request from upper layer.
> 
> Reasonable, even more so as I suppose BCS.source should eventually be a
> BdrvChild of backup-top.
> 
> What looks wrong is that the sync_bitmap is created on the source (“bs”
> in backup_job_create()), but backup_cleanup_sync_bitmap() still assumes
> it is on blk_bs(job->common.blk) (which is no longer true).
> 
>> So, we firstly create inject backup-top, then create job and BCS. BCS
>> is the latest just to not create extra variable for it. Finally we set
>> bcs for backup-top filter.
>>
>> = Iotest changes =
>>
>> 56: op-blocker doesn't shot now, as we set it on source, but then check
> 
> s/shot/show/?
> 
>> on filter, when trying to start second backup, so error caught in
>> test_dismiss_collision is changed. It's OK anyway, as this test-case
>> seems to test that after some collision we can dismiss first job and
>> successfully start the second one. So, the change is that collision is
>> changed from op-blocker to file-posix locks.
> 
> Well, but the op blocker belongs to the source, which should be covered
> by internal permissions.  The fact that it now shows up as a file-posix
> error thus shows that the conflict also moves from the source to the
> target.  It’s OK because we actually don’t have a conflict on the source.
> 
> But I wonder whether it would be better for test_dismiss_collision() to
> do a blockdev-backup instead where we can see the collision on the target.
> 
> Hm.  On second thought, why do we even get a conflict on the target?
> block-copy does share the WRITE permission for it...

Not sure, but assume that this is because in file-posix.c in raw_co_create
we do want RESIZE perm.

I can instead move this test to use specified job-id, to move the collision
to "job-id already exists" error. Is it better?

I'm afraid that posix locks will not work if disable them in config.

> 
>> However, it's obvious now that we'd better drop this op-blocker at all
>> and add a test-case for two backups from one node (to different
>> destinations) actually works. But not in these series.
>>
>> 257: The test wants to emulate guest write during backup. They should
>> go to filter node, not to original source node, of course. Therefore we
>> need to specify filter node name and use it.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>   qapi/block-core.json   |   8 +-
>>   include/block/block-copy.h |   2 +-
>>   include/block/block_int.h  |   1 +
>>   block/backup-top.c |  14 +-
>>   block/backup.c | 113 +++---
>>   

Re: [Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers

2019-08-28 Thread Max Reitz
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Drop write notifiers and use filter node instead.
> 
> = Changes =
> 
> 1. add filter-node-name argument for backup qmp api. We have to do it
> in this commit, as 257 needs to be fixed.

I feel a bit bad about it not being an implicit node.  But I know you
don’t like them, they’re probably just broken altogether and because
libvirt doesn’t use backup (yet), probably nobody cares.

> 2. there no move write notifiers here, so is_write_notifier parameter

s/there/there are/, I suppose?

> is dropped from block-copy paths.
> 
> 3. Intersecting requests handling changed, now synchronization between
> backup-top, backup and guest writes are all done in block/block-copy.c
> and works as follows:
> 
> On copy operation, we work only with dirty areas. If bits are dirty it
> means that there are no requests intersecting with this area. We clear
> dirty bits and take bdrv range lock (bdrv_co_try_lock) on this area to
> prevent further operations from interaction with guest (only with
> guest, as neither backup nor backup-top will touch non-dirty area). If
> copy-operation failed we set dirty bits back together with releasing
> the lock.
> 
> The actual difference with old scheme is that on guest writes we
> don't lock the whole region but only dirty-parts, and to be more
> precise: only dirty-part we are currently operate on. In old scheme
> guest write to non-dirty area (which may be safely ignored by backup)
> may wait for intersecting request, touching some other area which is
> dirty.
> 
> 4. To sync with in-flight requests at job finish we now have drained
> removing of the filter, we don't need rw-lock.
> 
> = Notes =
> 
> Note the consequence of three objects appearing: backup-top, backup job
> and block-copy-state:
> 
> 1. We want to insert backup-top before job creation, to behave similar
> with mirror and commit, where job is started upon filter.
> 
> 2. We also have to create block-copy-state after filter injection, as
> we don't want it's source child be replaced by fitler. Instead we want

s/it's/its/, s/filter/filter/ (although “fitler” does have an amusing
ring to it)

> to keep BCS.source to be real source node, as we want to use
> bdrv_co_try_lock in CBW operations and it can't be used on filter, as
> on filter we already have in-flight (write) request from upper layer.

Reasonable, even more so as I suppose BCS.source should eventually be a
BdrvChild of backup-top.

What looks wrong is that the sync_bitmap is created on the source (“bs”
in backup_job_create()), but backup_cleanup_sync_bitmap() still assumes
it is on blk_bs(job->common.blk) (which is no longer true).

> So, we firstly create inject backup-top, then create job and BCS. BCS
> is the latest just to not create extra variable for it. Finally we set
> bcs for backup-top filter.
> 
> = Iotest changes =
> 
> 56: op-blocker doesn't shot now, as we set it on source, but then check

s/shot/show/?

> on filter, when trying to start second backup, so error caught in
> test_dismiss_collision is changed. It's OK anyway, as this test-case
> seems to test that after some collision we can dismiss first job and
> successfully start the second one. So, the change is that collision is
> changed from op-blocker to file-posix locks.

Well, but the op blocker belongs to the source, which should be covered
by internal permissions.  The fact that it now shows up as a file-posix
error thus shows that the conflict also moves from the source to the
target.  It’s OK because we actually don’t have a conflict on the source.

But I wonder whether it would be better for test_dismiss_collision() to
do a blockdev-backup instead where we can see the collision on the target.

Hm.  On second thought, why do we even get a conflict on the target?
block-copy does share the WRITE permission for it...

> However, it's obvious now that we'd better drop this op-blocker at all
> and add a test-case for two backups from one node (to different
> destinations) actually works. But not in these series.
> 
> 257: The test wants to emulate guest write during backup. They should
> go to filter node, not to original source node, of course. Therefore we
> need to specify filter node name and use it.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  qapi/block-core.json   |   8 +-
>  include/block/block-copy.h |   2 +-
>  include/block/block_int.h  |   1 +
>  block/backup-top.c |  14 +-
>  block/backup.c | 113 +++---
>  block/block-copy.c |  55 ---
>  block/replication.c|   2 +-
>  blockdev.c |   1 +
>  tests/qemu-iotests/056 |   2 +-
>  tests/qemu-iotests/257 |   7 +-
>  tests/qemu-iotests/257.out | 306 ++---
>  11 files changed, 230 insertions(+), 281 deletions(-)

Nice.

I don’t have much to object (for some reason, I feel a bit uneasy about
dropping all the request waiting code, but I suppose that is indeed
taken 

[Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers

2019-08-26 Thread Vladimir Sementsov-Ogievskiy
Drop write notifiers and use filter node instead.

= Changes =

1. add filter-node-name argument for backup qmp api. We have to do it
in this commit, as 257 needs to be fixed.

2. there no move write notifiers here, so is_write_notifier parameter
is dropped from block-copy paths.

3. Intersecting requests handling changed, now synchronization between
backup-top, backup and guest writes are all done in block/block-copy.c
and works as follows:

On copy operation, we work only with dirty areas. If bits are dirty it
means that there are no requests intersecting with this area. We clear
dirty bits and take bdrv range lock (bdrv_co_try_lock) on this area to
prevent further operations from interaction with guest (only with
guest, as neither backup nor backup-top will touch non-dirty area). If
copy-operation failed we set dirty bits back together with releasing
the lock.

The actual difference with old scheme is that on guest writes we
don't lock the whole region but only dirty-parts, and to be more
precise: only dirty-part we are currently operate on. In old scheme
guest write to non-dirty area (which may be safely ignored by backup)
may wait for intersecting request, touching some other area which is
dirty.

4. To sync with in-flight requests at job finish we now have drained
removing of the filter, we don't need rw-lock.

= Notes =

Note the consequence of three objects appearing: backup-top, backup job
and block-copy-state:

1. We want to insert backup-top before job creation, to behave similar
with mirror and commit, where job is started upon filter.

2. We also have to create block-copy-state after filter injection, as
we don't want it's source child be replaced by fitler. Instead we want
to keep BCS.source to be real source node, as we want to use
bdrv_co_try_lock in CBW operations and it can't be used on filter, as
on filter we already have in-flight (write) request from upper layer.

So, we firstly create inject backup-top, then create job and BCS. BCS
is the latest just to not create extra variable for it. Finally we set
bcs for backup-top filter.

= Iotest changes =

56: op-blocker doesn't shot now, as we set it on source, but then check
on filter, when trying to start second backup, so error caught in
test_dismiss_collision is changed. It's OK anyway, as this test-case
seems to test that after some collision we can dismiss first job and
successfully start the second one. So, the change is that collision is
changed from op-blocker to file-posix locks.
However, it's obvious now that we'd better drop this op-blocker at all
and add a test-case for two backups from one node (to different
destinations) actually works. But not in these series.

257: The test wants to emulate guest write during backup. They should
go to filter node, not to original source node, of course. Therefore we
need to specify filter node name and use it.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block-core.json   |   8 +-
 include/block/block-copy.h |   2 +-
 include/block/block_int.h  |   1 +
 block/backup-top.c |  14 +-
 block/backup.c | 113 +++---
 block/block-copy.c |  55 ---
 block/replication.c|   2 +-
 blockdev.c |   1 +
 tests/qemu-iotests/056 |   2 +-
 tests/qemu-iotests/257 |   7 +-
 tests/qemu-iotests/257.out | 306 ++---
 11 files changed, 230 insertions(+), 281 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index e6edd641f1..b5cd00c361 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1391,6 +1391,11 @@
 #list without user intervention.
 #Defaults to true. (Since 2.12)
 #
+# @filter-node-name: the node name that should be assigned to the
+#filter driver that the backup job inserts into the graph
+#above node specified by @drive. If this option is not 
given,
+#a node name is autogenerated. (Since: 4.2)
+#
 # Note: @on-source-error and @on-target-error only affect background
 # I/O.  If an error occurs during a guest write request, the device's
 # rerror/werror actions will be used.
@@ -1404,7 +1409,8 @@
 '*compress': 'bool',
 '*on-source-error': 'BlockdevOnError',
 '*on-target-error': 'BlockdevOnError',
-'*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
+'*auto-finalize': 'bool', '*auto-dismiss': 'bool',
+'*filter-node-name': 'str' } }
 
 ##
 # @DriveBackup:
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 0dd7a3f7bf..c15d47a70d 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -54,6 +54,6 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
  int64_t offset, int64_t *count);
 
 int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, uint64_t bytes,
-bool *error_is_read,