Re: [PATCH v2 1/2] qcow2: add keep-dirty open option

2022-01-28 Thread Kirill Tkhai
On 27.01.2022 20:00, Vladimir Sementsov-Ogievskiy wrote:
> Consider the case:
> 
> Thirdparty component works with qcow2 image, and dirty bit is set.
> 
> Thirdparty component want to start qemu-img to do some manipulation.
> Ofcourse, third party component flushes refcounts and other metadata
> before starting QEMU.
> 
> But the component don't want to clear dirty bit, as this breaks
> transactionability of the operation: we'll have to set it again but it
> may fail. Clearing the dirty bit is unrecoverable action and can't be
> transactional. That's a problem.
> 
> The solution is a new qcow2 open option: keep-dirty. When set:
> 1. On qcow2 open, ignore dirty bit and don't do check: caller is
>responsible for refcounts being valid.
> 2. Never clear dirty bit during QEMU execution, including close.
> 
> Details:
> 
> 1. For simplicity let's just not allow keep-dirty without lazy
>refcounts.
> 
> 2. Don't allow to open with keep-dirty when dirty bit is unset. This
>may mean some error in user logic.
> 
> 3. For implementation do the following: dirty flag
>in s->incompatible_features behaves same way as without keep-dirty
>option: it actually designate status of refcounts dirtiness. But we
>never clear the flag in the image, and we remember that it is always
>set.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Tested-by: Kirill Tkhai 

> ---
>  qapi/block-core.json |  8 ++
>  block/qcow2.h|  2 ++
>  block/qcow2.c| 61 +++-
>  3 files changed, 59 insertions(+), 12 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 9a5a3641d0..4b5c6d7935 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -3228,6 +3228,13 @@
>  # @lazy-refcounts: whether to enable the lazy refcounts
>  #  feature (default is taken from the image file)
>  #
> +# @keep-dirty: whether to not touch dirty bit. When set, QEMU doesn't
> +#  check refcounts on qcow2 open (ignoring dirty bit) and doesn't
> +#  clear dirty bit on qcow2 close. When true dirty bit must
> +#  be already set in the image on open, otherwise open fails.
> +#  When true user guarantees that refcounts are consistent on
> +#  open, so the check is omitted. (since 7.0)
> +#
>  # @pass-discard-request: whether discard requests to the qcow2
>  #device should be forwarded to the data source
>  #
> @@ -3276,6 +3283,7 @@
>  { 'struct': 'BlockdevOptionsQcow2',
>'base': 'BlockdevOptionsGenericCOWFormat',
>'data': { '*lazy-refcounts': 'bool',
> +'*keep-dirty': 'bool',
>  '*pass-discard-request': 'bool',
>  '*pass-discard-snapshot': 'bool',
>  '*pass-discard-other': 'bool',
> diff --git a/block/qcow2.h b/block/qcow2.h
> index fd48a89d45..696e13377a 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -130,6 +130,7 @@
>  
>  #define QCOW2_OPT_DATA_FILE "data-file"
>  #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
> +#define QCOW2_OPT_KEEP_DIRTY "keep-dirty"
>  #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
>  #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot"
>  #define QCOW2_OPT_DISCARD_OTHER "pass-discard-other"
> @@ -376,6 +377,7 @@ typedef struct BDRVQcow2State {
>  int flags;
>  int qcow_version;
>  bool use_lazy_refcounts;
> +bool keep_dirty;
>  int refcount_order;
>  int refcount_bits;
>  uint64_t refcount_max;
> diff --git a/block/qcow2.c b/block/qcow2.c
> index d509016756..ee82ef2a8f 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -514,15 +514,17 @@ int qcow2_mark_dirty(BlockDriverState *bs)
>  return 0; /* already dirty */
>  }
>  
> -val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY);
> -ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, incompatible_features),
> -  , sizeof(val));
> -if (ret < 0) {
> -return ret;
> -}
> -ret = bdrv_flush(bs->file->bs);
> -if (ret < 0) {
> -return ret;
> +if (!s->keep_dirty) {
> +val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY);
> +ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, 
> incompatible_features),
> +  , sizeof(val));
> +if (ret < 0) {
> +return ret;
> +}
> +ret = bdrv_flush(bs->file->bs);
> +if (ret < 0) {
> +return ret;
> +}
&g

Re: [PATCH 1/2] qcow2: add keep-dirty open option

2022-01-27 Thread Kirill Tkhai
On 20.01.2022 19:20, Vladimir Sementsov-Ogievskiy wrote:
> Consider the case:
> 
> Thirdparty component works with qcow2 image, and dirty bit is set.
> 
> Thirdparty component want to start qemu-img to do some manipulation.
> Ofcourse, third party component flushes refcounts and other metadata
> before starting QEMU.
> 
> But the component don't want to clear dirty bit, as this breaks
> transactionability of the operation: we'll have to set it again but it
> may fail. Clearing the dirty bit is unrecoverable action and can't be
> transactional. That's a problem.
> 
> The solution is a new qcow2 open option: keep-dirty. When set:
> 1. On qcow2 open, ignore dirty bit and don't do check: caller is
>responsible for refcounts being valid.
> 2. Never clear dirty bit during QEMU execution, including close.
> 
> Details:
> 
> 1. For simplicity let's just not allow keep-dirty without lazy
>refcounts.
> 
> 2. Don't allow to open with keep-dirty when dirty bit is unset. This
>may mean some error in user logic.
> 
> 3. For implementation do the following: dirty flag
>in s->incompatible_features behaves same way as without keep-dirty
>option: it actually designate status of refcounts dirtiness. But we
>never clear the flag in the image, and we remember that it is always
>set.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  qapi/block-core.json |  5 
>  block/qcow2.h|  2 ++
>  block/qcow2.c| 66 
>  3 files changed, 61 insertions(+), 12 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 9a5a3641d0..3e35357182 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -3228,6 +3228,10 @@
>  # @lazy-refcounts: whether to enable the lazy refcounts
>  #  feature (default is taken from the image file)
>  #
> +# @keep-dirty: whether to not touch dirty bit. When set, QEMU doesn't
> +#  check refcounts on qcow2 open (ignoring dirty bit) and doesn't
> +#  clear dirty bit on qcow2 close. (since 7.0)
> +#
>  # @pass-discard-request: whether discard requests to the qcow2
>  #device should be forwarded to the data source
>  #
> @@ -3276,6 +3280,7 @@
>  { 'struct': 'BlockdevOptionsQcow2',
>'base': 'BlockdevOptionsGenericCOWFormat',
>'data': { '*lazy-refcounts': 'bool',
> +'*keep-dirty': 'bool',
>  '*pass-discard-request': 'bool',
>  '*pass-discard-snapshot': 'bool',
>  '*pass-discard-other': 'bool',
> diff --git a/block/qcow2.h b/block/qcow2.h
> index fd48a89d45..696e13377a 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -130,6 +130,7 @@
>  
>  #define QCOW2_OPT_DATA_FILE "data-file"
>  #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
> +#define QCOW2_OPT_KEEP_DIRTY "keep-dirty"
>  #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
>  #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot"
>  #define QCOW2_OPT_DISCARD_OTHER "pass-discard-other"
> @@ -376,6 +377,7 @@ typedef struct BDRVQcow2State {
>  int flags;
>  int qcow_version;
>  bool use_lazy_refcounts;
> +bool keep_dirty;
>  int refcount_order;
>  int refcount_bits;
>  uint64_t refcount_max;
> diff --git a/block/qcow2.c b/block/qcow2.c
> index d509016756..1c42103fb9 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -514,15 +514,17 @@ int qcow2_mark_dirty(BlockDriverState *bs)
>  return 0; /* already dirty */
>  }
>  
> -val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY);
> -ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, incompatible_features),
> -  , sizeof(val));
> -if (ret < 0) {
> -return ret;
> -}
> -ret = bdrv_flush(bs->file->bs);
> -if (ret < 0) {
> -return ret;
> +if (!s->keep_dirty) {
> +val = cpu_to_be64(s->incompatible_features | QCOW2_INCOMPAT_DIRTY);
> +ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, 
> incompatible_features),
> +  , sizeof(val));
> +if (ret < 0) {
> +return ret;
> +}
> +ret = bdrv_flush(bs->file->bs);
> +if (ret < 0) {
> +return ret;
> +}
>  }
>  
>  /* Only treat image as dirty if the header was updated successfully */
> @@ -549,7 +551,13 @@ static int qcow2_mark_clean(BlockDriverState *bs)
>  return ret;
>  }
>  
> -return qcow2_update_header(bs);
> +if (!s->keep_dirty) {
> +/*
> + * No reason to update the header if we don't want to clear dirty
> + * bit.
> + */
> +return qcow2_update_header(bs);
> +}
>  }
>  return 0;
>  }
> @@ -709,6 +717,11 @@ static QemuOptsList qcow2_runtime_opts = {
>  .type = QEMU_OPT_BOOL,
>  .help = "Postpone refcount updates",
>  },
> +{
> +

Re: [PATCH v2 10/10] qcow2-refcount: check_refblocks(): add separate message for reserved

2021-05-11 Thread Kirill Tkhai
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote:
> Split checking for reserved bits out of aligned offset check.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Eric Blake 

Tested-by: Kirill Tkhai 

> ---
>  block/qcow2.h  |  1 +
>  block/qcow2-refcount.c | 10 +-
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 58fd7f1678..fd48a89d45 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -591,6 +591,7 @@ typedef enum QCow2MetadataOverlap {
>  #define L2E_STD_RESERVED_MASK 0x3f0001feULL
>  
>  #define REFT_OFFSET_MASK 0xfe00ULL
> +#define REFT_RESERVED_MASK 0x1ffULL
>  
>  #define INV_OFFSET (-1ULL)
>  
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 15c4f6b075..472a7026db 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -2091,9 +2091,17 @@ static int check_refblocks(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  
>  for(i = 0; i < s->refcount_table_size; i++) {
>  uint64_t offset, cluster;
> -offset = s->refcount_table[i];
> +offset = s->refcount_table[i] & REFT_OFFSET_MASK;
>  cluster = offset >> s->cluster_bits;
>  
> +if (s->refcount_table[i] & REFT_RESERVED_MASK) {
> +fprintf(stderr, "ERROR refcount table entry %" PRId64 " has "
> +"reserved bits set\n", i);
> +res->corruptions++;
> +*rebuild = true;
> +continue;
> +}
> +
>  /* Refcount blocks are cluster aligned */
>  if (offset_into_cluster(s, offset)) {
>  fprintf(stderr, "ERROR refcount block %" PRId64 " is not "
> 




Re: [PATCH v2 09/10] qcow2-refcount: check_refcounts_l1(): check reserved bits

2021-05-11 Thread Kirill Tkhai
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Eric Blake 
> ---
>  block/qcow2.h  | 1 +
>  block/qcow2-refcount.c | 6 ++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/block/qcow2.h b/block/qcow2.h
> index b8b1093b61..58fd7f1678 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -586,6 +586,7 @@ typedef enum QCow2MetadataOverlap {
>  (QCOW2_OL_CACHED | QCOW2_OL_INACTIVE_L2)
>  
>  #define L1E_OFFSET_MASK 0x00fffe00ULL
> +#define L1E_RESERVED_MASK 0x7f0001ffULL
>  #define L2E_OFFSET_MASK 0x00fffe00ULL
>  #define L2E_STD_RESERVED_MASK 0x3f0001feULL
>  
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 69294a94fe..15c4f6b075 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -1904,6 +1904,12 @@ static int check_refcounts_l1(BlockDriverState *bs,
>  continue;
>  }
>  
> +if (l1_table[i] & L1E_RESERVED_MASK) {
> +fprintf(stderr, "ERROR found L1 entry with reserved bits set: "
> +    "%" PRIx64, l1_table[i]);

'\n' is missed.

The rest is OK for me.

Tested-by: Kirill Tkhai 

> +res->corruptions++;
> +}
> +
>  l2_offset = l1_table[i] & L1E_OFFSET_MASK;
>  
>  /* Mark L2 table as used */
> 




Re: [PATCH v2 07/10] qcow2-refcount: check_refcounts_l2(): check reserved bits

2021-05-11 Thread Kirill Tkhai
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Eric Blake 
> ---
>  block/qcow2.h  |  1 +
>  block/qcow2-refcount.c | 12 +++-
>  2 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2.h b/block/qcow2.h
> index c0e1e83796..b8b1093b61 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -587,6 +587,7 @@ typedef enum QCow2MetadataOverlap {
>  
>  #define L1E_OFFSET_MASK 0x00fffe00ULL
>  #define L2E_OFFSET_MASK 0x00fffe00ULL
> +#define L2E_STD_RESERVED_MASK 0x3f0001feULL
>  
>  #define REFT_OFFSET_MASK 0xfe00ULL
>  
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 062ec48a15..47cc82449b 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -1682,8 +1682,18 @@ static int check_refcounts_l2(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  int csize;
>  l2_entry = get_l2_entry(s, l2_table, i);
>  uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i);
> +QCow2ClusterType type = qcow2_get_cluster_type(bs, l2_entry);
>  
> -switch (qcow2_get_cluster_type(bs, l2_entry)) {
> +if (type != QCOW2_CLUSTER_COMPRESSED) {
> +/* Check reserved bits of Standard Cluster Descriptor */
> +if (l2_entry & L2E_STD_RESERVED_MASK) {
> +fprintf(stderr, "ERROR found l2 entry with reserved bits 
> set: "
> +"%" PRIx64, l2_entry);

'\n' is missed.

The rest is OK for me.

Tested-by: Kirill Tkhai 

> +res->corruptions++;
> +}
> +}
> +
> +switch (type) {
>  case QCOW2_CLUSTER_COMPRESSED:
>  /* Compressed clusters don't have QCOW_OFLAG_COPIED */
>  if (l2_entry & QCOW_OFLAG_COPIED) {
> 




Re: [PATCH v2 06/10] qcow2-refcount: check_refcounts_l2(): check l2_bitmap

2021-05-11 Thread Kirill Tkhai
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote:
> Check subcluster bitmap of the l2 entry for different types of
> clusters:
> 
>  - for compressed it must be zero
>  - for allocated check consistency of two parts of the bitmap
>  - for unallocated all subclusters should be unallocated
>(or zero-plain)
> 
> For unallocated clusters we can safely fix the entry by making it
> zero-plain.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Eric Blake 

Tested-by: Kirill Tkhai 

> ---
>  block/qcow2-refcount.c | 30 +-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index f48c5e1b5d..062ec48a15 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -1681,6 +1681,7 @@ static int check_refcounts_l2(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  uint64_t coffset;
>  int csize;
>  l2_entry = get_l2_entry(s, l2_table, i);
> +uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i);
>  
>  switch (qcow2_get_cluster_type(bs, l2_entry)) {
>  case QCOW2_CLUSTER_COMPRESSED:
> @@ -1700,6 +1701,14 @@ static int check_refcounts_l2(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  break;
>  }
>  
> +if (l2_bitmap) {
> +fprintf(stderr, "ERROR compressed cluster %d with non-zero "
> +"subcluster allocation bitmap, entry=0x%" PRIx64 
> "\n",
> +i, l2_entry);
> +res->corruptions++;
> +break;
> +}
> +
>  /* Mark cluster as used */
>  qcow2_parse_compressed_l2_entry(bs, l2_entry, , );
>  ret = qcow2_inc_refcounts_imrt(
> @@ -1727,13 +1736,19 @@ static int check_refcounts_l2(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  {
>  uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>  
> +if ((l2_bitmap >> 32) & l2_bitmap) {
> +res->corruptions++;
> +fprintf(stderr, "ERROR offset=%" PRIx64 ": Allocated "
> +"cluster has corrupted subcluster allocation 
> bitmap\n",
> +offset);
> +}
> +
>  /* Correct offsets are cluster aligned */
>  if (offset_into_cluster(s, offset)) {
>  bool contains_data;
>  res->corruptions++;
>  
>  if (has_subclusters(s)) {
> -uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i);
>  contains_data = (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC);
>  } else {
>  contains_data = !(l2_entry & QCOW_OFLAG_ZERO);
> @@ -1800,6 +1815,19 @@ static int check_refcounts_l2(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  
>  case QCOW2_CLUSTER_ZERO_PLAIN:
>  case QCOW2_CLUSTER_UNALLOCATED:
> +if (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC) {
> +res->corruptions++;
> +fprintf(stderr, "%s: Unallocated "
> +"cluster has non-zero subcluster allocation map\n",
> +fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR");
> +if (fix & BDRV_FIX_ERRORS) {
> +ret = fix_l2_entry_by_zero(bs, res, l2_offset, l2_table, 
> i,
> +   active, _overlap);
> +if (metadata_overlap) {
> +return ret;
> +}
> +}
> +}
>  break;
>  
>  default:
> 




Re: [PATCH] qcow2: set bdi->is_dirty

2021-05-04 Thread Kirill Tkhai
On 04.05.2021 19:06, Vladimir Sementsov-Ogievskiy wrote:
> Set bdi->is_dirty, so that qemu-img info could show dirty flag.
> 
> After this commit the following check will show '"dirty-flag": true':
> 
> ./build/qemu-img create -f qcow2 -o lazy_refcounts=on x 1M
> ./build/qemu-io x
> qemu-io> write 0 1M
> 
>  After "write" command success, kill the qemu-io process:
> 
> kill -9 
> 
> ./build/qemu-img info --output=json x
> 
> This will show '"dirty-flag": true' among other things. (before this
> commit it shows '"dirty-flag": false')
> 
> Note, that qcow2's dirty-bit is not a "dirty bit for the image". It
> only protects qcow2 lazy refcounts feature. So, there are a lot of
> conditions when qcow2 session may be not closed correctly, but bit is
> 0. Still, when bit is set, the last session is definitely not finished
> correctly and it's better to report it.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Works for me. Thanks,

Tested-by: Kirill Tkhai 

> ---
>  block/qcow2.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 9727ae8fe3..39b91ef940 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -5089,6 +5089,7 @@ static int qcow2_get_info(BlockDriverState *bs, 
> BlockDriverInfo *bdi)
>  BDRVQcow2State *s = bs->opaque;
>  bdi->cluster_size = s->cluster_size;
>  bdi->vm_state_offset = qcow2_vm_state_offset(s);
> +bdi->is_dirty = s->incompatible_features & QCOW2_INCOMPAT_DIRTY;
>  return 0;
>  }
>  
>