Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
On 4/22/20 10:21 AM, Kevin Wolf wrote: If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't undo any previous preallocation, but just adds the zero flag to all relevant L2 entries. If an external data file is in use, a write_zeroes request to the data file is made instead. Signed-off-by: Kevin Wolf --- block/qcow2.c | 30 ++ 1 file changed, 30 insertions(+) @@ -4214,6 +4215,35 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, g_assert_not_reached(); } +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) { +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size); +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size); This rounds up beyond the new size... + +/* Use zero clusters as much as we can */ +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, 0); and then requests that the extra be zeroed. Does that always work, even when it results in pdrv_co_pwrite_zeroes beyond the end of s->data_file? If so, Reviewed-by: Eric Blake otherwise, you may have to treat the tail specially, the same way you treated an unaligned head. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
Am 22.04.2020 um 17:33 hat Eric Blake geschrieben: > On 4/22/20 10:21 AM, Kevin Wolf wrote: > > If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling > > qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't > > undo any previous preallocation, but just adds the zero flag to all > > relevant L2 entries. If an external data file is in use, a write_zeroes > > request to the data file is made instead. > > > > Signed-off-by: Kevin Wolf > > --- > > block/qcow2.c | 30 ++ > > 1 file changed, 30 insertions(+) > > > > > @@ -4214,6 +4215,35 @@ static int coroutine_fn > > qcow2_co_truncate(BlockDriverState *bs, int64_t offset, > > g_assert_not_reached(); > > } > > +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) { > > +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size); > > +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size); > > This rounds up beyond the new size... > > > + > > +/* Use zero clusters as much as we can */ > > +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, > > 0); > > and then requests that the extra be zeroed. Does that always work, even > when it results in pdrv_co_pwrite_zeroes beyond the end of s->data_file? You mean the data_file_is_raw() path in qcow2_cluster_zeroize()? It's currently not a code path that is run because we only set BDRV_REQ_ZERO_WRITE for truncate if the image has a backing file, and data_file_is_raw() doesn't work with backing files. But hypothetically, if someone called truncate with BDRV_REQ_ZERO_WRITE for such a file, I think it would fail. > If so, > > Reviewed-by: Eric Blake > > otherwise, you may have to treat the tail specially, the same way you > treated an unaligned head. Actually, do I even need to round the tail? /* Caller must pass aligned values, except at image end */ assert(QEMU_IS_ALIGNED(offset, s->cluster_size)); assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) || end_offset == bs->total_sectors << BDRV_SECTOR_BITS); So qcow2_cluster_zeroize() seems to accept the unaligned tail. It would still set the zero flag for the partial last cluster and for the external data file, bdrv_co_pwrite_zeroes() would have the correct size. Kevin
Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
On 4/22/20 10:58 AM, Kevin Wolf wrote: @@ -4214,6 +4215,35 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, g_assert_not_reached(); } +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) { +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size); +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size); This rounds up beyond the new size... + +/* Use zero clusters as much as we can */ +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, 0); and then requests that the extra be zeroed. Does that always work, even when it results in pdrv_co_pwrite_zeroes beyond the end of s->data_file? You mean the data_file_is_raw() path in qcow2_cluster_zeroize()? It's currently not a code path that is run because we only set BDRV_REQ_ZERO_WRITE for truncate if the image has a backing file, and data_file_is_raw() doesn't work with backing files. Good point. But hypothetically, if someone called truncate with BDRV_REQ_ZERO_WRITE for such a file, I think it would fail. If so, Reviewed-by: Eric Blake otherwise, you may have to treat the tail specially, the same way you treated an unaligned head. Actually, do I even need to round the tail? /* Caller must pass aligned values, except at image end */ assert(QEMU_IS_ALIGNED(offset, s->cluster_size)); assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) || end_offset == bs->total_sectors << BDRV_SECTOR_BITS); So qcow2_cluster_zeroize() seems to accept the unaligned tail. It would still set the zero flag for the partial last cluster and for the external data file, bdrv_co_pwrite_zeroes() would have the correct size. Then I'm in favor of NOT rounding the tail. That's an easy enough change and we've now justified that it does what we want, so R-b stands with that one-line tweak. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
On 22.04.20 17:21, Kevin Wolf wrote: > If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling > qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't > undo any previous preallocation, but just adds the zero flag to all > relevant L2 entries. If an external data file is in use, a write_zeroes > request to the data file is made instead. > > Signed-off-by: Kevin Wolf > --- > block/qcow2.c | 30 ++ > 1 file changed, 30 insertions(+) > > diff --git a/block/qcow2.c b/block/qcow2.c > index 9cfbdfc939..bd632405d1 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c [...] > @@ -4214,6 +4215,35 @@ static int coroutine_fn > qcow2_co_truncate(BlockDriverState *bs, int64_t offset, > g_assert_not_reached(); > } > > +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) { > +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size); > +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size); > + > +/* Use zero clusters as much as we can */ > +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, > 0); It’s kind of a pity that this changes the cluster mappings that were established when using falloc/full preallocation already (i.e., they become preallocated zero clusters then, so when writing to them, we need COW again). But falloc/full preallocation do not guarantee that the new data is zero, so I suppose this is the only thing we can reasonably do. I personally don’t really care about whether zero_end is aligned or not, so either way: Reviewed-by: Max Reitz signature.asc Description: OpenPGP digital signature
Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
Am 22.04.2020 um 18:14 hat Eric Blake geschrieben: > On 4/22/20 10:58 AM, Kevin Wolf wrote: > > > > > @@ -4214,6 +4215,35 @@ static int coroutine_fn > > > > qcow2_co_truncate(BlockDriverState *bs, int64_t offset, > > > >g_assert_not_reached(); > > > >} > > > > +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) { > > > > +uint64_t zero_start = QEMU_ALIGN_UP(old_length, > > > > s->cluster_size); > > > > +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size); > > > > > > This rounds up beyond the new size... > > > > > > > + > > > > +/* Use zero clusters as much as we can */ > > > > +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - > > > > zero_start, 0); > > > > > > and then requests that the extra be zeroed. Does that always work, even > > > when it results in pdrv_co_pwrite_zeroes beyond the end of s->data_file? > > > > You mean the data_file_is_raw() path in qcow2_cluster_zeroize()? It's > > currently not a code path that is run because we only set > > BDRV_REQ_ZERO_WRITE for truncate if the image has a backing file, and > > data_file_is_raw() doesn't work with backing files. > > Good point. > > > > > But hypothetically, if someone called truncate with BDRV_REQ_ZERO_WRITE > > for such a file, I think it would fail. > > > > > If so, > > > > > > Reviewed-by: Eric Blake > > > > > > otherwise, you may have to treat the tail specially, the same way you > > > treated an unaligned head. > > > > Actually, do I even need to round the tail? > > > > /* Caller must pass aligned values, except at image end */ > > assert(QEMU_IS_ALIGNED(offset, s->cluster_size)); > > assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) || > > end_offset == bs->total_sectors << BDRV_SECTOR_BITS); > > > > So qcow2_cluster_zeroize() seems to accept the unaligned tail. It would > > still set the zero flag for the partial last cluster and for the > > external data file, bdrv_co_pwrite_zeroes() would have the correct size. > > Then I'm in favor of NOT rounding the tail. That's an easy enough change > and we've now justified that it does what we want, so R-b stands with that > one-line tweak. Would have been too easy... bs->total_sectors isn't updated yet, so the assertion does fail. I can make the assertion check end_offset >= ... instead. That should still check what we wanted to check here and allow the unaligned extension. This feels like the better option to me compared to updating bs->total_sectors earlier and then undoing that change in every error path. Kevin
Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
Am 23.04.2020 um 12:53 hat Max Reitz geschrieben: > On 22.04.20 17:21, Kevin Wolf wrote: > > If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling > > qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't > > undo any previous preallocation, but just adds the zero flag to all > > relevant L2 entries. If an external data file is in use, a write_zeroes > > request to the data file is made instead. > > > > Signed-off-by: Kevin Wolf > > --- > > block/qcow2.c | 30 ++ > > 1 file changed, 30 insertions(+) > > > > diff --git a/block/qcow2.c b/block/qcow2.c > > index 9cfbdfc939..bd632405d1 100644 > > --- a/block/qcow2.c > > +++ b/block/qcow2.c > > [...] > > > @@ -4214,6 +4215,35 @@ static int coroutine_fn > > qcow2_co_truncate(BlockDriverState *bs, int64_t offset, > > g_assert_not_reached(); > > } > > > > +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) { > > +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size); > > +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size); > > + > > +/* Use zero clusters as much as we can */ > > +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, > > 0); > > It’s kind of a pity that this changes the cluster mappings that were > established when using falloc/full preallocation already (i.e., they > become preallocated zero clusters then, so when writing to them, we need > COW again). > > But falloc/full preallocation do not guarantee that the new data is > zero, so I suppose this is the only thing we can reasonably do. If we really want, I guess we could make full preallocation first try passing BDRV_REQ_ZERO_WRITE to the protocol layer and if that succeeds, we could skip setting the zero cluster flag at the qcow2 level. Feels like a separate patch, though. Kevin signature.asc Description: PGP signature
Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
On 23.04.20 15:25, Kevin Wolf wrote: > Am 23.04.2020 um 12:53 hat Max Reitz geschrieben: >> On 22.04.20 17:21, Kevin Wolf wrote: >>> If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling >>> qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't >>> undo any previous preallocation, but just adds the zero flag to all >>> relevant L2 entries. If an external data file is in use, a write_zeroes >>> request to the data file is made instead. >>> >>> Signed-off-by: Kevin Wolf >>> --- >>> block/qcow2.c | 30 ++ >>> 1 file changed, 30 insertions(+) >>> >>> diff --git a/block/qcow2.c b/block/qcow2.c >>> index 9cfbdfc939..bd632405d1 100644 >>> --- a/block/qcow2.c >>> +++ b/block/qcow2.c >> >> [...] >> >>> @@ -4214,6 +4215,35 @@ static int coroutine_fn >>> qcow2_co_truncate(BlockDriverState *bs, int64_t offset, >>> g_assert_not_reached(); >>> } >>> >>> +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) { >>> +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size); >>> +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size); >>> + >>> +/* Use zero clusters as much as we can */ >>> +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, >>> 0); >> >> It’s kind of a pity that this changes the cluster mappings that were >> established when using falloc/full preallocation already (i.e., they >> become preallocated zero clusters then, so when writing to them, we need >> COW again). >> >> But falloc/full preallocation do not guarantee that the new data is >> zero, so I suppose this is the only thing we can reasonably do. > > If we really want, I guess we could make full preallocation first try > passing BDRV_REQ_ZERO_WRITE to the protocol layer and if that succeeds, > we could skip setting the zero cluster flag at the qcow2 level. That might be nice. > Feels like a separate patch, though. Definitely, yes. Max signature.asc Description: OpenPGP digital signature
Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
On 4/23/20 8:23 AM, Kevin Wolf wrote: So qcow2_cluster_zeroize() seems to accept the unaligned tail. It would still set the zero flag for the partial last cluster and for the external data file, bdrv_co_pwrite_zeroes() would have the correct size. Then I'm in favor of NOT rounding the tail. That's an easy enough change and we've now justified that it does what we want, so R-b stands with that one-line tweak. Would have been too easy... bs->total_sectors isn't updated yet, so the assertion does fail. I can make the assertion check end_offset >= ... instead. That should still check what we wanted to check here and allow the unaligned extension. Yes, that works for me. This feels like the better option to me compared to updating bs->total_sectors earlier and then undoing that change in every error path. Indeed. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org