Re: [Qemu-block] [Qemu-devel] [PATCH v2] migration/dirty-bitmaps: change bitmap enumeration method

Vladimir Sementsov-Ogievskiy Fri, 17 May 2019 04:00:15 -0700

16.05.2019 22:03, John Snow wrote:
> 
> 
> On 5/16/19 6:12 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 14.05.2019 23:19, John Snow wrote:
>>> Shift from looking at every root BDS to *every* BDS. This will migrate
>>> bitmaps that are attached to blockdev created nodes instead of just ones
>>> attached to emulated storage devices.
>>>
>>> Note that this will not migrate anonymous or internal-use bitmaps, as
>>> those are defined as having no name.
>>>
>>> This will also fix the Coverity issues Peter Maydell has been asking
>>> about for the past several releases, as well as fixing a real bug.
>>>
>>> Reported-by: Peter Maydell <peter.mayd...@linaro.org>
>>> Reported-by: Coverity 😅
>>> Reported-by: aihua liang <ali...@redhat.com>
>>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652490
>>> Fixes: Coverity CID 1390625
>>> CC: Stefan Hajnoczi <stefa...@redhat.com>
>>> Signed-off-by: John Snow <js...@redhat.com>
>>> ---
>>>    migration/block-dirty-bitmap.c | 14 ++++----------
>>>    1 file changed, 4 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
>>> index d1bb863cb6..4a896a09eb 100644
>>> --- a/migration/block-dirty-bitmap.c
>>> +++ b/migration/block-dirty-bitmap.c
>>> @@ -273,7 +273,6 @@ static int init_dirty_bitmap_migration(void)
>>>        BlockDriverState *bs;
>>>        BdrvDirtyBitmap *bitmap;
>>>        DirtyBitmapMigBitmapState *dbms;
>>> -    BdrvNextIterator it;
>>>        Error *local_err = NULL;
>>>    
>>>        dirty_bitmap_mig_state.bulk_completed = false;
>>> @@ -281,13 +280,8 @@ static int init_dirty_bitmap_migration(void)
>>>        dirty_bitmap_mig_state.prev_bitmap = NULL;
>>>        dirty_bitmap_mig_state.no_bitmaps = false;
>>>    
>>> -    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
>>> -        const char *drive_name = bdrv_get_device_or_node_name(bs);
>>> -
>>> -        /* skip automatically inserted nodes */
>>> -        while (bs && bs->drv && bs->implicit) {
>>> -            bs = backing_bs(bs);
>>> -        }
>>
>> hm, so, after the patch, for implicitly-filtered nodes we'll have node_name 
>> instead of device name..
>>
> 
> Oh, I see -- this does change what we send over the wire for
> interior/leaf nodes; that was unintentional on my part.
> 
> After my patch, this requires that if you have a manually constructed
> tree such that you have attached a bitmap to an interior or leaf node,
> you *need* to name that node so that it can be consistently
> reconstructed at the target.
> 
> I think that's a reasonable requirement and is actually superior to
> re-attaching all bitmaps to the root on migration (which would have
> happened before.)
> 
> Codewise, what we have currently (both before and after this patch) is:
> 
>      if (flags & DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME) {
>          qemu_put_counted_string(f, dbms->node_name);
>      }
> 
> So we named the constant "DEVICE_NAME", but the field was already named
> node_name, so this seems fine on the sending end. In practice, pre-patch
> we sent a device_name for any node that was the root attached to a
> device. Post-patch, that doesn't change because I am using the same API
> call to retrieve the name.
> 
> For interior/leaf nodes, we now send the node-name specifically instead
> of the name of the device root. This requires identically constructed
> (or at least compatibly named) graphs on the source and destination,
> which is a reasonable requirement for migration.
> 
> On the receiving end, we have this code:
> 
>      if (s->flags & DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME) {
>          if (!qemu_get_counted_string(f, s->node_name)) {
>              error_report("Unable to read node name string");
>              return -EINVAL;
>       }
>          s->bs = bdrv_lookup_bs(s->node_name, s->node_name, &local_err);
> 
> which looks like a correct mapping. I think this is a safe change, even
> though I made it somewhat unintentionally.
> 
>> But, on the other, hand, if we have implicitly-filtered node on target, we 
>> were doing wrong thing anyway,
>> as dirty_bitmap_load_header don't skip implicit nodes.
>>
>>> +    for (bs = bdrv_next_all_states(NULL); bs; bs = 
>>> bdrv_next_all_states(bs)) {
>>
>> As I understand, difference with bdrv_next_node is that we don't skip 
>> unnamed nodes [...]
>>
> 
> The difference is that we iterate over states that aren't roots of
> trees; so not just unnamed nodes but rather intermediate nodes as well
> as nodes not attached to a storage device.
> 
> bdrv_next says: `Iterates over all top-level BlockDriverStates, i.e.
> BDSs that are owned by the monitor or attached to a BlockBackend`
> 
> So this loop is going to iterate over *everything*, not just top-level
> nodes. This lets me skip the tree-crawling loop that didn't work quite
> right.


I meant not bdrv_next but bdrv_next_node, which iterates through graph nodes..
What is real difference between graph_bdrv_states and all_bdrv_states ?

Node is inserted to graph_bdrv_states in bdrv_assign_node_name(), and to
all_bdrv_states in bdrv_new().

Three calls to bdrv_new:

bdrv_new_open_driver, calls bdrv_new and then bdrv_open_driver, which calls 
bdrv_assign_node_name,
and if it fails new created node is released.

bdrv_open_inherit
    bdrv_new
    ..
    bdrv_open_common
       bdrv_open_driver
           bdrv_assign_node_name


iscsi_co_create_opts
    bdrv_new

    ... hmm.. looks like it a kind of temporary unnamed bs


So, now, I'm not sure. Possibly we'd better use bdrv_next_node().

Kevin introduced all_bdrv_states in 0f12264e7a4145 , to use in drain instead of
bdrv_next... But I don't understand, why he didn't use graph_bdrv_states and
corresponding bdrv_next_node(), which is only skips some temporary or 
under-constuction
nodes..

> 
>>> +        const char *name = bdrv_get_device_or_node_name(bs);
>>>    
>>>            for (bitmap = bdrv_dirty_bitmap_next(bs, NULL); bitmap;
>>>                 bitmap = bdrv_dirty_bitmap_next(bs, bitmap))
>>> @@ -296,7 +290,7 @@ static int init_dirty_bitmap_migration(void)
>>>                    continue;
>>>                }
>>>    
>>> -            if (drive_name == NULL) {
>>> +            if (!name || strcmp(name, "") == 0) {
>>
>> [...] to do this (may be paranoiac, but why not?) check
>>
> 
> Because bdrv_get_device_or_node_name technically has the capability to
> return an empty string, though I believe in contemporary block layer
> that we always generate a node-name. So it might be paranoid, but it
> matches the API I'm calling as documented.
> 
>>>                    error_report("Found bitmap '%s' in unnamed node %p. It 
>>> can't "
>>>                                 "be migrated", 
>>> bdrv_dirty_bitmap_name(bitmap), bs);
>>>                    goto fail;
>>> @@ -313,7 +307,7 @@ static int init_dirty_bitmap_migration(void)
>>>    
>>>                dbms = g_new0(DirtyBitmapMigBitmapState, 1);
>>>                dbms->bs = bs;
>>> -            dbms->node_name = drive_name;
>>> +            dbms->node_name = name;
>>>                dbms->bitmap = bitmap;
>>>                dbms->total_sectors = bdrv_nb_sectors(bs);
>>>                dbms->sectors_per_chunk = CHUNK_SIZE * 8 *
>>>
>>
>> There is still some mess about device name vs node name, and I don't know, 
>> could we somehow
>> solve it, but patch looks OK for me anyway:
>>
> 
> Yes, there's more mess, but I think this is Less Wrong™️. I will try to
> combat some of this confusion soon; but I'll look at your cross-node
> merge patch first.
> 
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
>>
> 
> Does this review still stand after my clarifications?
> 

Let's figure out graph_bdrv_states vs all_bdrv_states first.


-- 
Best regards,
Vladimir

Re: [Qemu-block] [Qemu-devel] [PATCH v2] migration/dirty-bitmaps: change bitmap enumeration method

Reply via email to