Re: [PATCH v2 1/2] qcow2: add keep-dirty open option
On 27.01.2022 20:00, Vladimir Sementsov-Ogievskiy wrote: > Consider the case: > > Thirdparty component works with qcow2 image, and dirty bit is set. > > Thirdparty component want to start qemu-img to do some manipulation. > Ofcourse, third party component flushes refcounts and other metadata > before starting QEMU. > > But the component don't want to clear dirty bit, as this breaks > transactionability of the operation: we'll have to set it again but it > may fail. Clearing the dirty bit is unrecoverable action and can't be > transactional. That's a problem. > > The solution is a new qcow2 open option: keep-dirty. When set: > 1. On qcow2 open, ignore dirty bit and don't do check: caller is >responsible for refcounts being valid. > 2. Never clear dirty bit during QEMU execution, including close. > > Details: > > 1. For simplicity let's just not allow keep-dirty without lazy >refcounts. > > 2. Don't allow to open with keep-dirty when dirty bit is unset. This >may mean some error in user logic. > > 3. For implementation do the following: dirty flag >in s->incompatible_features behaves same way as without keep-dirty >option: it actually designate status of refcounts dirtiness. But we >never clear the flag in the image, and we remember that it is always >set. > > Signed-off-by: Vladimir Sementsov-Ogievskiy Tested-by: Kirill Tkhai > --- > qapi/block-core.json | 8 ++ > block/qcow2.h| 2 ++ > block/qcow2.c| 61 +++- > 3 files changed, 59 insertions(+), 12 deletions(-) > > diff --git a/qapi/block-core.json b/qapi/block-core.json > index 9a5a3641d0..4b5c6d7935 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -3228,6 +3228,13 @@ > # @lazy-refcounts: whether to enable the lazy refcounts > # feature (default is taken from the image file) > # > +# @keep-dirty: whether to not touch dirty bit. When set, QEMU doesn't > +# check refcounts on qcow2 open (ignoring dirty bit) and doesn't > +# clear dirty bit on qcow2 close. When true dirty bit must > +# be already set in the image on open, otherwise open fails. > +# When true user guarantees that refcounts are consistent on > +# open, so the check is omitted. (since 7.0) > +# > # @pass-discard-request: whether discard requests to the qcow2 > #device should be forwarded to the data source > # > @@ -3276,6 +3283,7 @@ > { 'struct': 'BlockdevOptionsQcow2', >'base': 'BlockdevOptionsGenericCOWFormat', >'data': { '*lazy-refcounts': 'bool', > +'*keep-dirty': 'bool', > '*pass-discard-request': 'bool', > '*pass-discard-snapshot': 'bool', > '*pass-discard-other': 'bool', > diff --git a/block/qcow2.h b/block/qcow2.h > index fd48a89d45..696e13377a 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -130,6 +130,7 @@ > > #define QCOW2_OPT_DATA_FILE "data-file" > #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" > +#define QCOW2_OPT_KEEP_DIRTY "keep-dirty" > #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request" > #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot" > #define QCOW2_OPT_DISCARD_OTHER "pass-discard-other" > @@ -376,6 +377,7 @@ typedef struct BDRVQcow2State { > int flags; > int qcow_version; > bool use_lazy_refcounts; > +bool keep_dirty; > int refcount_order; > int refcount_bits; > uint64_t refcount_max; > diff --git a/block/qcow2.c b/block/qcow2.c > index d509016756..ee82ef2a8f 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -514,15 +514,17 @@ int qcow2_mark_dirty(BlockDriverState *bs) > return 0; /* already dirty */ > } > > -val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY); > -ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, incompatible_features), > - , sizeof(val)); > -if (ret < 0) { > -return ret; > -} > -ret = bdrv_flush(bs->file->bs); > -if (ret < 0) { > -return ret; > +if (!s->keep_dirty) { > +val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY); > +ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, > incompatible_features), > + , sizeof(val)); > +if (ret < 0) { > +return ret; > +} > +ret = bdrv_flush(bs->file->bs); > +if (ret < 0) { > +return ret; > +} &g
Re: [PATCH 1/2] qcow2: add keep-dirty open option
On 20.01.2022 19:20, Vladimir Sementsov-Ogievskiy wrote: > Consider the case: > > Thirdparty component works with qcow2 image, and dirty bit is set. > > Thirdparty component want to start qemu-img to do some manipulation. > Ofcourse, third party component flushes refcounts and other metadata > before starting QEMU. > > But the component don't want to clear dirty bit, as this breaks > transactionability of the operation: we'll have to set it again but it > may fail. Clearing the dirty bit is unrecoverable action and can't be > transactional. That's a problem. > > The solution is a new qcow2 open option: keep-dirty. When set: > 1. On qcow2 open, ignore dirty bit and don't do check: caller is >responsible for refcounts being valid. > 2. Never clear dirty bit during QEMU execution, including close. > > Details: > > 1. For simplicity let's just not allow keep-dirty without lazy >refcounts. > > 2. Don't allow to open with keep-dirty when dirty bit is unset. This >may mean some error in user logic. > > 3. For implementation do the following: dirty flag >in s->incompatible_features behaves same way as without keep-dirty >option: it actually designate status of refcounts dirtiness. But we >never clear the flag in the image, and we remember that it is always >set. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > qapi/block-core.json | 5 > block/qcow2.h| 2 ++ > block/qcow2.c| 66 > 3 files changed, 61 insertions(+), 12 deletions(-) > > diff --git a/qapi/block-core.json b/qapi/block-core.json > index 9a5a3641d0..3e35357182 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -3228,6 +3228,10 @@ > # @lazy-refcounts: whether to enable the lazy refcounts > # feature (default is taken from the image file) > # > +# @keep-dirty: whether to not touch dirty bit. When set, QEMU doesn't > +# check refcounts on qcow2 open (ignoring dirty bit) and doesn't > +# clear dirty bit on qcow2 close. (since 7.0) > +# > # @pass-discard-request: whether discard requests to the qcow2 > #device should be forwarded to the data source > # > @@ -3276,6 +3280,7 @@ > { 'struct': 'BlockdevOptionsQcow2', >'base': 'BlockdevOptionsGenericCOWFormat', >'data': { '*lazy-refcounts': 'bool', > +'*keep-dirty': 'bool', > '*pass-discard-request': 'bool', > '*pass-discard-snapshot': 'bool', > '*pass-discard-other': 'bool', > diff --git a/block/qcow2.h b/block/qcow2.h > index fd48a89d45..696e13377a 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -130,6 +130,7 @@ > > #define QCOW2_OPT_DATA_FILE "data-file" > #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" > +#define QCOW2_OPT_KEEP_DIRTY "keep-dirty" > #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request" > #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot" > #define QCOW2_OPT_DISCARD_OTHER "pass-discard-other" > @@ -376,6 +377,7 @@ typedef struct BDRVQcow2State { > int flags; > int qcow_version; > bool use_lazy_refcounts; > +bool keep_dirty; > int refcount_order; > int refcount_bits; > uint64_t refcount_max; > diff --git a/block/qcow2.c b/block/qcow2.c > index d509016756..1c42103fb9 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -514,15 +514,17 @@ int qcow2_mark_dirty(BlockDriverState *bs) > return 0; /* already dirty */ > } > > -val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY); > -ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, incompatible_features), > - , sizeof(val)); > -if (ret < 0) { > -return ret; > -} > -ret = bdrv_flush(bs->file->bs); > -if (ret < 0) { > -return ret; > +if (!s->keep_dirty) { > +val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY); > +ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, > incompatible_features), > + , sizeof(val)); > +if (ret < 0) { > +return ret; > +} > +ret = bdrv_flush(bs->file->bs); > +if (ret < 0) { > +return ret; > +} > } > > /* Only treat image as dirty if the header was updated successfully */ > @@ -549,7 +551,13 @@ static int qcow2_mark_clean(BlockDriverState *bs) > return ret; > } > > -return qcow2_update_header(bs); > +if (!s->keep_dirty) { > +/* > + * No reason to update the header if we don't want to clear dirty > + * bit. > + */ > +return qcow2_update_header(bs); > +} > } > return 0; > } > @@ -709,6 +717,11 @@ static QemuOptsList qcow2_runtime_opts = { > .type = QEMU_OPT_BOOL, > .help = "Postpone refcount updates", > }, > +{ > +
Re: [PATCH v2 10/10] qcow2-refcount: check_refblocks(): add separate message for reserved
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote: > Split checking for reserved bits out of aligned offset check. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Eric Blake Tested-by: Kirill Tkhai > --- > block/qcow2.h | 1 + > block/qcow2-refcount.c | 10 +- > 2 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/block/qcow2.h b/block/qcow2.h > index 58fd7f1678..fd48a89d45 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -591,6 +591,7 @@ typedef enum QCow2MetadataOverlap { > #define L2E_STD_RESERVED_MASK 0x3f0001feULL > > #define REFT_OFFSET_MASK 0xfe00ULL > +#define REFT_RESERVED_MASK 0x1ffULL > > #define INV_OFFSET (-1ULL) > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index 15c4f6b075..472a7026db 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -2091,9 +2091,17 @@ static int check_refblocks(BlockDriverState *bs, > BdrvCheckResult *res, > > for(i = 0; i < s->refcount_table_size; i++) { > uint64_t offset, cluster; > -offset = s->refcount_table[i]; > +offset = s->refcount_table[i] & REFT_OFFSET_MASK; > cluster = offset >> s->cluster_bits; > > +if (s->refcount_table[i] & REFT_RESERVED_MASK) { > +fprintf(stderr, "ERROR refcount table entry %" PRId64 " has " > +"reserved bits set\n", i); > +res->corruptions++; > +*rebuild = true; > +continue; > +} > + > /* Refcount blocks are cluster aligned */ > if (offset_into_cluster(s, offset)) { > fprintf(stderr, "ERROR refcount block %" PRId64 " is not " >
Re: [PATCH v2 09/10] qcow2-refcount: check_refcounts_l1(): check reserved bits
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Eric Blake > --- > block/qcow2.h | 1 + > block/qcow2-refcount.c | 6 ++ > 2 files changed, 7 insertions(+) > > diff --git a/block/qcow2.h b/block/qcow2.h > index b8b1093b61..58fd7f1678 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -586,6 +586,7 @@ typedef enum QCow2MetadataOverlap { > (QCOW2_OL_CACHED | QCOW2_OL_INACTIVE_L2) > > #define L1E_OFFSET_MASK 0x00fffe00ULL > +#define L1E_RESERVED_MASK 0x7f0001ffULL > #define L2E_OFFSET_MASK 0x00fffe00ULL > #define L2E_STD_RESERVED_MASK 0x3f0001feULL > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index 69294a94fe..15c4f6b075 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -1904,6 +1904,12 @@ static int check_refcounts_l1(BlockDriverState *bs, > continue; > } > > +if (l1_table[i] & L1E_RESERVED_MASK) { > +fprintf(stderr, "ERROR found L1 entry with reserved bits set: " > + "%" PRIx64, l1_table[i]); '\n' is missed. The rest is OK for me. Tested-by: Kirill Tkhai > +res->corruptions++; > +} > + > l2_offset = l1_table[i] & L1E_OFFSET_MASK; > > /* Mark L2 table as used */ >
Re: [PATCH v2 07/10] qcow2-refcount: check_refcounts_l2(): check reserved bits
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Eric Blake > --- > block/qcow2.h | 1 + > block/qcow2-refcount.c | 12 +++- > 2 files changed, 12 insertions(+), 1 deletion(-) > > diff --git a/block/qcow2.h b/block/qcow2.h > index c0e1e83796..b8b1093b61 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -587,6 +587,7 @@ typedef enum QCow2MetadataOverlap { > > #define L1E_OFFSET_MASK 0x00fffe00ULL > #define L2E_OFFSET_MASK 0x00fffe00ULL > +#define L2E_STD_RESERVED_MASK 0x3f0001feULL > > #define REFT_OFFSET_MASK 0xfe00ULL > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index 062ec48a15..47cc82449b 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -1682,8 +1682,18 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > int csize; > l2_entry = get_l2_entry(s, l2_table, i); > uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i); > +QCow2ClusterType type = qcow2_get_cluster_type(bs, l2_entry); > > -switch (qcow2_get_cluster_type(bs, l2_entry)) { > +if (type != QCOW2_CLUSTER_COMPRESSED) { > +/* Check reserved bits of Standard Cluster Descriptor */ > +if (l2_entry & L2E_STD_RESERVED_MASK) { > +fprintf(stderr, "ERROR found l2 entry with reserved bits > set: " > +"%" PRIx64, l2_entry); '\n' is missed. The rest is OK for me. Tested-by: Kirill Tkhai > +res->corruptions++; > +} > +} > + > +switch (type) { > case QCOW2_CLUSTER_COMPRESSED: > /* Compressed clusters don't have QCOW_OFLAG_COPIED */ > if (l2_entry & QCOW_OFLAG_COPIED) { >
Re: [PATCH v2 06/10] qcow2-refcount: check_refcounts_l2(): check l2_bitmap
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote: > Check subcluster bitmap of the l2 entry for different types of > clusters: > > - for compressed it must be zero > - for allocated check consistency of two parts of the bitmap > - for unallocated all subclusters should be unallocated >(or zero-plain) > > For unallocated clusters we can safely fix the entry by making it > zero-plain. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Eric Blake Tested-by: Kirill Tkhai > --- > block/qcow2-refcount.c | 30 +- > 1 file changed, 29 insertions(+), 1 deletion(-) > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index f48c5e1b5d..062ec48a15 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -1681,6 +1681,7 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > uint64_t coffset; > int csize; > l2_entry = get_l2_entry(s, l2_table, i); > +uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i); > > switch (qcow2_get_cluster_type(bs, l2_entry)) { > case QCOW2_CLUSTER_COMPRESSED: > @@ -1700,6 +1701,14 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > break; > } > > +if (l2_bitmap) { > +fprintf(stderr, "ERROR compressed cluster %d with non-zero " > +"subcluster allocation bitmap, entry=0x%" PRIx64 > "\n", > +i, l2_entry); > +res->corruptions++; > +break; > +} > + > /* Mark cluster as used */ > qcow2_parse_compressed_l2_entry(bs, l2_entry, , ); > ret = qcow2_inc_refcounts_imrt( > @@ -1727,13 +1736,19 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > { > uint64_t offset = l2_entry & L2E_OFFSET_MASK; > > +if ((l2_bitmap >> 32) & l2_bitmap) { > +res->corruptions++; > +fprintf(stderr, "ERROR offset=%" PRIx64 ": Allocated " > +"cluster has corrupted subcluster allocation > bitmap\n", > +offset); > +} > + > /* Correct offsets are cluster aligned */ > if (offset_into_cluster(s, offset)) { > bool contains_data; > res->corruptions++; > > if (has_subclusters(s)) { > -uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i); > contains_data = (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC); > } else { > contains_data = !(l2_entry & QCOW_OFLAG_ZERO); > @@ -1800,6 +1815,19 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > > case QCOW2_CLUSTER_ZERO_PLAIN: > case QCOW2_CLUSTER_UNALLOCATED: > +if (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC) { > +res->corruptions++; > +fprintf(stderr, "%s: Unallocated " > +"cluster has non-zero subcluster allocation map\n", > +fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR"); > +if (fix & BDRV_FIX_ERRORS) { > +ret = fix_l2_entry_by_zero(bs, res, l2_offset, l2_table, > i, > + active, _overlap); > +if (metadata_overlap) { > +return ret; > +} > +} > +} > break; > > default: >
Re: [PATCH] qcow2: set bdi->is_dirty
On 04.05.2021 19:06, Vladimir Sementsov-Ogievskiy wrote: > Set bdi->is_dirty, so that qemu-img info could show dirty flag. > > After this commit the following check will show '"dirty-flag": true': > > ./build/qemu-img create -f qcow2 -o lazy_refcounts=on x 1M > ./build/qemu-io x > qemu-io> write 0 1M > > After "write" command success, kill the qemu-io process: > > kill -9 > > ./build/qemu-img info --output=json x > > This will show '"dirty-flag": true' among other things. (before this > commit it shows '"dirty-flag": false') > > Note, that qcow2's dirty-bit is not a "dirty bit for the image". It > only protects qcow2 lazy refcounts feature. So, there are a lot of > conditions when qcow2 session may be not closed correctly, but bit is > 0. Still, when bit is set, the last session is definitely not finished > correctly and it's better to report it. > > Signed-off-by: Vladimir Sementsov-Ogievskiy Works for me. Thanks, Tested-by: Kirill Tkhai > --- > block/qcow2.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/block/qcow2.c b/block/qcow2.c > index 9727ae8fe3..39b91ef940 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -5089,6 +5089,7 @@ static int qcow2_get_info(BlockDriverState *bs, > BlockDriverInfo *bdi) > BDRVQcow2State *s = bs->opaque; > bdi->cluster_size = s->cluster_size; > bdi->vm_state_offset = qcow2_vm_state_offset(s); > +bdi->is_dirty = s->incompatible_features & QCOW2_INCOMPAT_DIRTY; > return 0; > } > >