10.06.26 05:33, Christian Barry пишет:
From: Christian Barry <[email protected]>
Changed qcow2_co_remove_persistent_dirty_bitmap() to treat
free_bitmap_clusters()
failure as fatal and roll back the header update through a new helper
function rollback_ext_header_and_dir_helper(), which was also applied
to update_ext_header_and_dir(), which had internal rollback logic.
Changed bitmap_list_store() to zero the cluster tail after write,
guaranteeing that qemu-img won't parse into stale bytes.
Resolves: https://gitlab.com/qemu-project/qemu/-/work_items/3516
Hi!
First, seem like two different fixes merged into one commit, please if
possible, make them separate two commits.
Next, please, make commit messages more descriptive, about what the
issue we solve (link is good as additional information, but the commit
message itself should be complete).
For example, I don't understand free_bitmap_clusters makes the image
unreadable. And we we fix it in one place, leaving other calls to
free_bitmap_clusters() unchanged. Could we fix free_bitmap_clusters()
itself? See also my comments below
Signed-off-by: Christian Barry <[email protected]>
Co-developed by: Eduardo Augusto Cavalcanti <[email protected]>
Signed-off-by: Eduardo Augusto Cavalcanti <[email protected]>
---
block/qcow2-bitmap.c | 37 ++++++++++++++++++++++++++++++++-----
1 file changed, 32 insertions(+), 5 deletions(-)
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 256ec99878..5796298157 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -737,10 +737,12 @@ static int GRAPH_RDLOCK
bitmap_list_store(BlockDriverState *bs, Qcow2BitmapList *bm_list,
uint64_t *offset, uint64_t *size, bool in_place)
{
+ BDRVQcow2State *s = bs->opaque;
int ret;
uint8_t *dir;
int64_t dir_offset = 0;
uint64_t dir_size = 0;
+ uint64_t tail_size;
Qcow2Bitmap *bm;
Qcow2BitmapDirEntry *e;
@@ -810,6 +812,14 @@ bitmap_list_store(BlockDriverState *bs, Qcow2BitmapList *bm_list,
goto fail;
}
+ tail_size = ROUND_UP(dir_size, s->cluster_size) - dir_size;
+ if (tail_size > 0) {
+ ret = bdrv_pwrite_zeroes(bs->file, dir_offset + dir_size, tail_size,
0);
+ if (ret < 0) {
+ goto fail;
+ }
+ }
If zeroing of unused bytes help to fix some issue, it obviously means,
that something is wrong in deeper logic.
+
g_free(dir);
if (!in_place) {
@@ -882,6 +892,19 @@ update_ext_header_and_dir_in_place(BlockDriverState *bs,
*/
}
+static void rollback_ext_header_and_dir_helper(BlockDriverState *bs,
+ uint64_t old_offset,
+ uint64_t old_size,
+ uint32_t old_nb_bitmaps,
+ uint64_t old_autocl)
+{
+ BDRVQcow2State *s = bs->opaque;
+ s->bitmap_directory_offset = old_offset;
+ s->bitmap_directory_size = old_size;
+ s->nb_bitmaps = old_nb_bitmaps;
+ s->autoclear_features = old_autocl;
+}
I think, better to inline this code, or refactor deeper. This way it only
makes code more complicated: we can't deduplicate first part of logic,
which is defining these 4 variables, but do that for the second part.
+
static int GRAPH_RDLOCK
update_ext_header_and_dir(BlockDriverState *bs, Qcow2BitmapList *bm_list)
{
@@ -937,10 +960,7 @@ fail:
qcow2_free_clusters(bs, new_offset, new_size, QCOW2_DISCARD_OTHER);
}
- s->bitmap_directory_offset = old_offset;
- s->bitmap_directory_size = old_size;
- s->nb_bitmaps = old_nb_bitmaps;
- s->autoclear_features = old_autocl;
+ rollback_ext_header_and_dir_helper(bs, old_offset, old_size,
old_nb_bitmaps, old_autocl);
return ret;
}
@@ -1460,6 +1480,10 @@ int coroutine_fn
qcow2_co_remove_persistent_dirty_bitmap(BlockDriverState *bs,
{
int ret;
BDRVQcow2State *s = bs->opaque;
+ uint64_t old_offset = s->bitmap_directory_offset;
+ uint64_t old_size = s->bitmap_directory_size;
+ uint32_t old_nb_bitmaps = s->nb_bitmaps;
+ uint64_t old_autocl = s->autoclear_features;
Qcow2Bitmap *bm = NULL;
Qcow2BitmapList *bm_list;
@@ -1495,7 +1519,10 @@ int coroutine_fn qcow2_co_remove_persistent_dirty_bitmap(BlockDriverState *bs,
goto out;
}
- free_bitmap_clusters(bs, &bm->table);
+ ret = free_bitmap_clusters(bs, &bm->table);
+ if (ret < 0){
+ rollback_ext_header_and_dir_helper(bs, old_offset, old_size,
old_nb_bitmaps, old_autocl);
+ }
Again, this looks strange too. Ignoring failure of freeing _unused_ clusters
should not be a problem. If it is, it means that things are broken earlier
and something should be fixed at earlier stage.
out:
qemu_co_mutex_unlock(&s->lock);
---
Let me try to understand the issue at
https://gitlab.com/qemu-project/qemu/-/work_items/3516..
1. We failed to flush bitmaps. So, autoclear bit should be unset in the header
and the whole bitmap extension considered inconsistent
2. On next qcow2 open, no bitmaps should be loaded, and we'll see message of
error_printf("Some clusters may be leaked, "
"run 'qemu-img check -r' on the image "
"file to fix.");
What's wrong with that?
--
Best regards,
Vladimir