Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
Am 21.10.2014 um 18:26 hat Max Reitz geschrieben: > On 2014-10-20 at 16:48, Kevin Wolf wrote: > >Am 20.10.2014 um 16:39 hat Max Reitz geschrieben: > >>On 20.10.2014 at 16:25, Kevin Wolf wrote: > >>>Am 29.08.2014 um 23:40 hat Max Reitz geschrieben: > The size of a refblock entry is (in theory) variable; calculate > therefore the number of entries per refblock and the according bit shift > (1 << x == entry count) when opening an image. > > Signed-off-by: Max Reitz > --- > block/qcow2.c | 2 ++ > block/qcow2.h | 2 ++ > 2 files changed, 4 insertions(+) > > diff --git a/block/qcow2.c b/block/qcow2.c > index f9e045f..172ad00 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict > *options, int flags, > s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ > s->l2_size = 1 << s->l2_bits; > +s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); > +s->refcount_block_size = 1 << s->refcount_block_bits; > bs->total_sectors = header.size / 512; > s->csize_shift = (62 - (s->cluster_bits - 8)); > s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; > diff --git a/block/qcow2.h b/block/qcow2.h > index 6aeb7ea..7c01fb7 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -222,6 +222,8 @@ typedef struct BDRVQcowState { > int l2_size; > int l1_size; > int l1_vm_state_index; > +int refcount_block_bits; > +int refcount_block_size; > >>>Might just be me, but size sounds to me as if the unit were bytes. Would > >>>you mind renaming this as refcount_block_entries or refblock_entries? > >>If I'm doing a v7, no. Otherwise, well, see l1_size and l2_size. ;-) > >Good point. This has confused people more than once, so it's probably > >not just me. > > Okay, now that I've done it and was about to send the series and > just wanted to convert a local variable named refcount_table_size to > refcount_table_entries, I decided not do do this. I'll call it > refcount_block_size in v7 as well. > > The reason for this is that I started looking for "_size" in > block/qcow2-refcount.c. "_entries" is never used, the number of > entries per L1, L2 or refcount table is always foo_size. In > BDRVQcowState, there's not only l1_size and l2_size, but > refcount_table_size as well. Calling it refcount_block_entries > without renaming those would be extremely weird, and renaming those > does not seem like a viable option to me. Furthermore, I'd find it > extremely confusing to have "_entries" in some places and "_size" in > others when there's no difference between the two. Currently, people > ask "Why is this foo_size an entry count? ... Well, okay, that seems > to be just the way it is." With foo_entries, it'll be "Why is this > foo_size an entry count when bar_entries exists, so shouldn't it be > foo_entries if they want it to be an entry count?" > > I'll keep it as refcount_block_size, although I'm afraid reverting > all these changes will be as hard as having made them in the first > place... Oh, well, here goes. Fair enough. Perhaps I should prepare a series renaming some things in qcow2 once your current series are in (most importantly BDRVQcowState, which also exists in qcow1 with the same name and makes debugging with gdb harder than necessary...) Kevin
Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
On 2014-10-20 at 16:48, Kevin Wolf wrote: Am 20.10.2014 um 16:39 hat Max Reitz geschrieben: On 20.10.2014 at 16:25, Kevin Wolf wrote: Am 29.08.2014 um 23:40 hat Max Reitz geschrieben: The size of a refblock entry is (in theory) variable; calculate therefore the number of entries per refblock and the according bit shift (1 << x == entry count) when opening an image. Signed-off-by: Max Reitz --- block/qcow2.c | 2 ++ block/qcow2.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/block/qcow2.c b/block/qcow2.c index f9e045f..172ad00 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ s->l2_size = 1 << s->l2_bits; +s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); +s->refcount_block_size = 1 << s->refcount_block_bits; bs->total_sectors = header.size / 512; s->csize_shift = (62 - (s->cluster_bits - 8)); s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; diff --git a/block/qcow2.h b/block/qcow2.h index 6aeb7ea..7c01fb7 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -222,6 +222,8 @@ typedef struct BDRVQcowState { int l2_size; int l1_size; int l1_vm_state_index; +int refcount_block_bits; +int refcount_block_size; Might just be me, but size sounds to me as if the unit were bytes. Would you mind renaming this as refcount_block_entries or refblock_entries? If I'm doing a v7, no. Otherwise, well, see l1_size and l2_size. ;-) Good point. This has confused people more than once, so it's probably not just me. Okay, now that I've done it and was about to send the series and just wanted to convert a local variable named refcount_table_size to refcount_table_entries, I decided not do do this. I'll call it refcount_block_size in v7 as well. The reason for this is that I started looking for "_size" in block/qcow2-refcount.c. "_entries" is never used, the number of entries per L1, L2 or refcount table is always foo_size. In BDRVQcowState, there's not only l1_size and l2_size, but refcount_table_size as well. Calling it refcount_block_entries without renaming those would be extremely weird, and renaming those does not seem like a viable option to me. Furthermore, I'd find it extremely confusing to have "_entries" in some places and "_size" in others when there's no difference between the two. Currently, people ask "Why is this foo_size an entry count? ... Well, okay, that seems to be just the way it is." With foo_entries, it'll be "Why is this foo_size an entry count when bar_entries exists, so shouldn't it be foo_entries if they want it to be an entry count?" I'll keep it as refcount_block_size, although I'm afraid reverting all these changes will be as hard as having made them in the first place... Oh, well, here goes. Max
Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
Am 29.08.2014 um 23:40 hat Max Reitz geschrieben: > The size of a refblock entry is (in theory) variable; calculate > therefore the number of entries per refblock and the according bit shift > (1 << x == entry count) when opening an image. > > Signed-off-by: Max Reitz > --- > block/qcow2.c | 2 ++ > block/qcow2.h | 2 ++ > 2 files changed, 4 insertions(+) > > diff --git a/block/qcow2.c b/block/qcow2.c > index f9e045f..172ad00 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict > *options, int flags, > > s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ > s->l2_size = 1 << s->l2_bits; > +s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); > +s->refcount_block_size = 1 << s->refcount_block_bits; > bs->total_sectors = header.size / 512; > s->csize_shift = (62 - (s->cluster_bits - 8)); > s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; > diff --git a/block/qcow2.h b/block/qcow2.h > index 6aeb7ea..7c01fb7 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -222,6 +222,8 @@ typedef struct BDRVQcowState { > int l2_size; > int l1_size; > int l1_vm_state_index; > +int refcount_block_bits; > +int refcount_block_size; Might just be me, but size sounds to me as if the unit were bytes. Would you mind renaming this as refcount_block_entries or refblock_entries? Kevin
Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
On 20.10.2014 at 16:25, Kevin Wolf wrote: Am 29.08.2014 um 23:40 hat Max Reitz geschrieben: The size of a refblock entry is (in theory) variable; calculate therefore the number of entries per refblock and the according bit shift (1 << x == entry count) when opening an image. Signed-off-by: Max Reitz --- block/qcow2.c | 2 ++ block/qcow2.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/block/qcow2.c b/block/qcow2.c index f9e045f..172ad00 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ s->l2_size = 1 << s->l2_bits; +s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); +s->refcount_block_size = 1 << s->refcount_block_bits; bs->total_sectors = header.size / 512; s->csize_shift = (62 - (s->cluster_bits - 8)); s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; diff --git a/block/qcow2.h b/block/qcow2.h index 6aeb7ea..7c01fb7 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -222,6 +222,8 @@ typedef struct BDRVQcowState { int l2_size; int l1_size; int l1_vm_state_index; +int refcount_block_bits; +int refcount_block_size; Might just be me, but size sounds to me as if the unit were bytes. Would you mind renaming this as refcount_block_entries or refblock_entries? If I'm doing a v7, no. Otherwise, well, see l1_size and l2_size. ;-) Max
Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
Am 20.10.2014 um 16:39 hat Max Reitz geschrieben: > On 20.10.2014 at 16:25, Kevin Wolf wrote: > >Am 29.08.2014 um 23:40 hat Max Reitz geschrieben: > >>The size of a refblock entry is (in theory) variable; calculate > >>therefore the number of entries per refblock and the according bit shift > >>(1 << x == entry count) when opening an image. > >> > >>Signed-off-by: Max Reitz > >>--- > >> block/qcow2.c | 2 ++ > >> block/qcow2.h | 2 ++ > >> 2 files changed, 4 insertions(+) > >> > >>diff --git a/block/qcow2.c b/block/qcow2.c > >>index f9e045f..172ad00 100644 > >>--- a/block/qcow2.c > >>+++ b/block/qcow2.c > >>@@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict > >>*options, int flags, > >> s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ > >> s->l2_size = 1 << s->l2_bits; > >>+s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); > >>+s->refcount_block_size = 1 << s->refcount_block_bits; > >> bs->total_sectors = header.size / 512; > >> s->csize_shift = (62 - (s->cluster_bits - 8)); > >> s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; > >>diff --git a/block/qcow2.h b/block/qcow2.h > >>index 6aeb7ea..7c01fb7 100644 > >>--- a/block/qcow2.h > >>+++ b/block/qcow2.h > >>@@ -222,6 +222,8 @@ typedef struct BDRVQcowState { > >> int l2_size; > >> int l1_size; > >> int l1_vm_state_index; > >>+int refcount_block_bits; > >>+int refcount_block_size; > >Might just be me, but size sounds to me as if the unit were bytes. Would > >you mind renaming this as refcount_block_entries or refblock_entries? > > If I'm doing a v7, no. Otherwise, well, see l1_size and l2_size. ;-) Good point. This has confused people more than once, so it's probably not just me. Kevin
Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
Am 10.10.2014 um 14:29 schrieb Benoît Canet: On Fri, Aug 29, 2014 at 11:40:53PM +0200, Max Reitz wrote: The size of a refblock entry is (in theory) variable; calculate therefore the number of entries per refblock and the according bit shift (1 << x == entry count) when opening an image. Signed-off-by: Max Reitz --- block/qcow2.c | 2 ++ block/qcow2.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/block/qcow2.c b/block/qcow2.c index f9e045f..172ad00 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ s->l2_size = 1 << s->l2_bits; +s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); After carefull examination (s->refcount_order - 3) == REFCOUNT_SHIFT. Why not also creating s->refcount_shift and make use of it in order to avoid torturing the mind of the next reader. I'm sorry. *g* Well, I'm against creating s->refcount_shift, because that name implies you could use it for shifts. But you shouldn't do that, because it may be both negative and positive. Or simply recall in a comment that there is 2^3 bits in a byte. I like this more. :-) Max Best regards Benoît +s->refcount_block_size = 1 << s->refcount_block_bits; bs->total_sectors = header.size / 512; s->csize_shift = (62 - (s->cluster_bits - 8)); s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; diff --git a/block/qcow2.h b/block/qcow2.h index 6aeb7ea..7c01fb7 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -222,6 +222,8 @@ typedef struct BDRVQcowState { int l2_size; int l1_size; int l1_vm_state_index; +int refcount_block_bits; +int refcount_block_size; int csize_shift; int csize_mask; uint64_t cluster_offset_mask; -- 2.1.0
Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
On Fri, Aug 29, 2014 at 11:40:53PM +0200, Max Reitz wrote: > The size of a refblock entry is (in theory) variable; calculate > therefore the number of entries per refblock and the according bit shift > (1 << x == entry count) when opening an image. > > Signed-off-by: Max Reitz > --- > block/qcow2.c | 2 ++ > block/qcow2.h | 2 ++ > 2 files changed, 4 insertions(+) > > diff --git a/block/qcow2.c b/block/qcow2.c > index f9e045f..172ad00 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict > *options, int flags, > > s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ > s->l2_size = 1 << s->l2_bits; > +s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); After carefull examination (s->refcount_order - 3) == REFCOUNT_SHIFT. Why not also creating s->refcount_shift and make use of it in order to avoid torturing the mind of the next reader. Or simply recall in a comment that there is 2^3 bits in a byte. Best regards Benoît > +s->refcount_block_size = 1 << s->refcount_block_bits; > bs->total_sectors = header.size / 512; > s->csize_shift = (62 - (s->cluster_bits - 8)); > s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; > diff --git a/block/qcow2.h b/block/qcow2.h > index 6aeb7ea..7c01fb7 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -222,6 +222,8 @@ typedef struct BDRVQcowState { > int l2_size; > int l1_size; > int l1_vm_state_index; > +int refcount_block_bits; > +int refcount_block_size; > int csize_shift; > int csize_mask; > uint64_t cluster_offset_mask; > -- > 2.1.0 >
Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
On 30.08.2014 01:03, Eric Blake wrote: On 08/29/2014 03:40 PM, Max Reitz wrote: The size of a refblock entry is (in theory) variable; calculate therefore the number of entries per refblock and the according bit shift (1 << x == entry count) when opening an image. Signed-off-by: Max Reitz --- block/qcow2.c | 2 ++ block/qcow2.h | 2 ++ 2 files changed, 4 insertions(+) What is the maximum refcount_order? The specs don't mention; the file format is wide open to overflows. Even something as benign-sounding as refcount_order=6 (64 bits) means that each cluster can be referenced 2**64 times, which is far longer than our lifetimes to build it up that high incrementally, and represents far greater than the amount of storage in existence being deduplicated! Shockingly easy to start getting into undefined territory, so maybe we ought to explicitly cap refcount_order to 6. Well, the most obvious issue to me is that qcow2 only supports 64 bit offsets and sizes etc., so it shouldn't have refcounts wider than 64 bits. On the other hand, it is probably possible to generate a valid image with a cluster having a refcount which exceeds 2^{64} - 1: Set the virtual size to (2^{64} - cluster_size), use a single data cluster for all virtual clusters which makes its refcount (2^{64} - cluster_size) / cluster_size (which would be 2^{55} - 1 in the most extreme case). Then you create cluster_size snapshots and the refcount of that cluster is now at (2^{55} - 1) * (1 + 512) = 2^{64} + 2^{55} - 512 - 1 >= 2^{64}. But that would be crazy, so I think it very reasonable to forbid refcount_order > 6, too. diff --git a/block/qcow2.c b/block/qcow2.c index f9e045f..172ad00 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ s->l2_size = 1 << s->l2_bits; +s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); Hmm; we document that qemu requires cluster_bits to be between 9 and 21 inclusive. So, if cluster_bits is 9 (512-byte clusters), and refcount_order is 6, then we can pack in 9 - (6 - 3) or 2**6 (that is, 64) refcounts per cluster. On the other extreme, the minimum refcount_order of 0 (each cluster occupies refcount bits, and so is either allocated or not, but no sharing), starts having the math looks ugly, because you are mixing: (int) = (uint32_t) - ( (uint32_t) - (int) ) so at one point, you are doing s->cluster_bits - (4294967293U), but that wraps around (thankfully, wraparound is well-defined on unsigned types) for a net answer of cluster_bits + 3. But in the worst case, that means an image with 2M clusters will be packing 21 - (0 - 3) or 2**24 (that is, 16M) refcounts in one cluster. Still fits in an int, so it looks like you are safe... +s->refcount_block_size = 1 << s->refcount_block_bits; ...that this particular shift will not cause undefined behavior, for reasonable refcount_order in the range [0,6]. Well, for now refcount_order is asserted to be 4, anyway. ;-) Reviewed-by: Eric Blake [We really ought to tighten the qcow2 spec - but that's a separate patch] Yep, I'll include it in v2 of the follow-up series. Max
Re: [Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
On 08/29/2014 03:40 PM, Max Reitz wrote: > The size of a refblock entry is (in theory) variable; calculate > therefore the number of entries per refblock and the according bit shift > (1 << x == entry count) when opening an image. > > Signed-off-by: Max Reitz > --- > block/qcow2.c | 2 ++ > block/qcow2.h | 2 ++ > 2 files changed, 4 insertions(+) What is the maximum refcount_order? The specs don't mention; the file format is wide open to overflows. Even something as benign-sounding as refcount_order=6 (64 bits) means that each cluster can be referenced 2**64 times, which is far longer than our lifetimes to build it up that high incrementally, and represents far greater than the amount of storage in existence being deduplicated! Shockingly easy to start getting into undefined territory, so maybe we ought to explicitly cap refcount_order to 6. > > diff --git a/block/qcow2.c b/block/qcow2.c > index f9e045f..172ad00 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict > *options, int flags, > > s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ > s->l2_size = 1 << s->l2_bits; > +s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); Hmm; we document that qemu requires cluster_bits to be between 9 and 21 inclusive. So, if cluster_bits is 9 (512-byte clusters), and refcount_order is 6, then we can pack in 9 - (6 - 3) or 2**6 (that is, 64) refcounts per cluster. On the other extreme, the minimum refcount_order of 0 (each cluster occupies refcount bits, and so is either allocated or not, but no sharing), starts having the math looks ugly, because you are mixing: (int) = (uint32_t) - ( (uint32_t) - (int) ) so at one point, you are doing s->cluster_bits - (4294967293U), but that wraps around (thankfully, wraparound is well-defined on unsigned types) for a net answer of cluster_bits + 3. But in the worst case, that means an image with 2M clusters will be packing 21 - (0 - 3) or 2**24 (that is, 16M) refcounts in one cluster. Still fits in an int, so it looks like you are safe... > +s->refcount_block_size = 1 << s->refcount_block_bits; ...that this particular shift will not cause undefined behavior, for reasonable refcount_order in the range [0,6]. Reviewed-by: Eric Blake [We really ought to tighten the qcow2 spec - but that's a separate patch] -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH v5 01/11] qcow2: Calculate refcount block entry count
The size of a refblock entry is (in theory) variable; calculate therefore the number of entries per refblock and the according bit shift (1 << x == entry count) when opening an image. Signed-off-by: Max Reitz --- block/qcow2.c | 2 ++ block/qcow2.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/block/qcow2.c b/block/qcow2.c index f9e045f..172ad00 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -689,6 +689,8 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */ s->l2_size = 1 << s->l2_bits; +s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); +s->refcount_block_size = 1 << s->refcount_block_bits; bs->total_sectors = header.size / 512; s->csize_shift = (62 - (s->cluster_bits - 8)); s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; diff --git a/block/qcow2.h b/block/qcow2.h index 6aeb7ea..7c01fb7 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -222,6 +222,8 @@ typedef struct BDRVQcowState { int l2_size; int l1_size; int l1_vm_state_index; +int refcount_block_bits; +int refcount_block_size; int csize_shift; int csize_mask; uint64_t cluster_offset_mask; -- 2.1.0