Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate

2020-04-22 Thread Eric Blake

On 4/22/20 10:21 AM, Kevin Wolf wrote:

If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling
qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't
undo any previous preallocation, but just adds the zero flag to all
relevant L2 entries. If an external data file is in use, a write_zeroes
request to the data file is made instead.

Signed-off-by: Kevin Wolf 
---
  block/qcow2.c | 30 ++
  1 file changed, 30 insertions(+)




@@ -4214,6 +4215,35 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
  g_assert_not_reached();
  }
  
+if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) {

+uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);
+uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size);


This rounds up beyond the new size...


+
+/* Use zero clusters as much as we can */
+ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, 0);


and then requests that the extra be zeroed.  Does that always work, even 
when it results in pdrv_co_pwrite_zeroes beyond the end of s->data_file? 
 If so,


Reviewed-by: Eric Blake 

otherwise, you may have to treat the tail specially, the same way you 
treated an unaligned head.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate

2020-04-22 Thread Kevin Wolf
Am 22.04.2020 um 17:33 hat Eric Blake geschrieben:
> On 4/22/20 10:21 AM, Kevin Wolf wrote:
> > If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling
> > qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't
> > undo any previous preallocation, but just adds the zero flag to all
> > relevant L2 entries. If an external data file is in use, a write_zeroes
> > request to the data file is made instead.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >   block/qcow2.c | 30 ++
> >   1 file changed, 30 insertions(+)
> > 
> 
> > @@ -4214,6 +4215,35 @@ static int coroutine_fn 
> > qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
> >   g_assert_not_reached();
> >   }
> > +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) {
> > +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);
> > +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size);
> 
> This rounds up beyond the new size...
> 
> > +
> > +/* Use zero clusters as much as we can */
> > +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, 
> > 0);
> 
> and then requests that the extra be zeroed.  Does that always work, even
> when it results in pdrv_co_pwrite_zeroes beyond the end of s->data_file?

You mean the data_file_is_raw() path in qcow2_cluster_zeroize()? It's
currently not a code path that is run because we only set
BDRV_REQ_ZERO_WRITE for truncate if the image has a backing file, and
data_file_is_raw() doesn't work with backing files.

But hypothetically, if someone called truncate with BDRV_REQ_ZERO_WRITE
for such a file, I think it would fail.

> If so,
> 
> Reviewed-by: Eric Blake 
> 
> otherwise, you may have to treat the tail specially, the same way you
> treated an unaligned head.

Actually, do I even need to round the tail?

/* Caller must pass aligned values, except at image end */
assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
   end_offset == bs->total_sectors << BDRV_SECTOR_BITS);

So qcow2_cluster_zeroize() seems to accept the unaligned tail. It would
still set the zero flag for the partial last cluster and for the
external data file, bdrv_co_pwrite_zeroes() would have the correct size.

Kevin




Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate

2020-04-22 Thread Eric Blake

On 4/22/20 10:58 AM, Kevin Wolf wrote:


@@ -4214,6 +4215,35 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
   g_assert_not_reached();
   }
+if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) {
+uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);
+uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size);


This rounds up beyond the new size...


+
+/* Use zero clusters as much as we can */
+ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, 0);


and then requests that the extra be zeroed.  Does that always work, even
when it results in pdrv_co_pwrite_zeroes beyond the end of s->data_file?


You mean the data_file_is_raw() path in qcow2_cluster_zeroize()? It's
currently not a code path that is run because we only set
BDRV_REQ_ZERO_WRITE for truncate if the image has a backing file, and
data_file_is_raw() doesn't work with backing files.


Good point.



But hypothetically, if someone called truncate with BDRV_REQ_ZERO_WRITE
for such a file, I think it would fail.


If so,

Reviewed-by: Eric Blake 

otherwise, you may have to treat the tail specially, the same way you
treated an unaligned head.


Actually, do I even need to round the tail?

 /* Caller must pass aligned values, except at image end */
 assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
 assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
end_offset == bs->total_sectors << BDRV_SECTOR_BITS);

So qcow2_cluster_zeroize() seems to accept the unaligned tail. It would
still set the zero flag for the partial last cluster and for the
external data file, bdrv_co_pwrite_zeroes() would have the correct size.


Then I'm in favor of NOT rounding the tail.  That's an easy enough 
change and we've now justified that it does what we want, so R-b stands 
with that one-line tweak.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate

2020-04-23 Thread Max Reitz
On 22.04.20 17:21, Kevin Wolf wrote:
> If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling
> qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't
> undo any previous preallocation, but just adds the zero flag to all
> relevant L2 entries. If an external data file is in use, a write_zeroes
> request to the data file is made instead.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/qcow2.c | 30 ++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 9cfbdfc939..bd632405d1 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c

[...]

> @@ -4214,6 +4215,35 @@ static int coroutine_fn 
> qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
>  g_assert_not_reached();
>  }
>  
> +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) {
> +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);
> +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size);
> +
> +/* Use zero clusters as much as we can */
> +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, 
> 0);

It’s kind of a pity that this changes the cluster mappings that were
established when using falloc/full preallocation already (i.e., they
become preallocated zero clusters then, so when writing to them, we need
COW again).

But falloc/full preallocation do not guarantee that the new data is
zero, so I suppose this is the only thing we can reasonably do.

I personally don’t really care about whether zero_end is aligned or not,
so either way:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate

2020-04-23 Thread Kevin Wolf
Am 22.04.2020 um 18:14 hat Eric Blake geschrieben:
> On 4/22/20 10:58 AM, Kevin Wolf wrote:
> 
> > > > @@ -4214,6 +4215,35 @@ static int coroutine_fn 
> > > > qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
> > > >g_assert_not_reached();
> > > >}
> > > > +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) {
> > > > +uint64_t zero_start = QEMU_ALIGN_UP(old_length, 
> > > > s->cluster_size);
> > > > +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size);
> > > 
> > > This rounds up beyond the new size...
> > > 
> > > > +
> > > > +/* Use zero clusters as much as we can */
> > > > +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - 
> > > > zero_start, 0);
> > > 
> > > and then requests that the extra be zeroed.  Does that always work, even
> > > when it results in pdrv_co_pwrite_zeroes beyond the end of s->data_file?
> > 
> > You mean the data_file_is_raw() path in qcow2_cluster_zeroize()? It's
> > currently not a code path that is run because we only set
> > BDRV_REQ_ZERO_WRITE for truncate if the image has a backing file, and
> > data_file_is_raw() doesn't work with backing files.
> 
> Good point.
> 
> > 
> > But hypothetically, if someone called truncate with BDRV_REQ_ZERO_WRITE
> > for such a file, I think it would fail.
> > 
> > > If so,
> > > 
> > > Reviewed-by: Eric Blake 
> > > 
> > > otherwise, you may have to treat the tail specially, the same way you
> > > treated an unaligned head.
> > 
> > Actually, do I even need to round the tail?
> > 
> >  /* Caller must pass aligned values, except at image end */
> >  assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
> >  assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
> > end_offset == bs->total_sectors << BDRV_SECTOR_BITS);
> > 
> > So qcow2_cluster_zeroize() seems to accept the unaligned tail. It would
> > still set the zero flag for the partial last cluster and for the
> > external data file, bdrv_co_pwrite_zeroes() would have the correct size.
> 
> Then I'm in favor of NOT rounding the tail.  That's an easy enough change
> and we've now justified that it does what we want, so R-b stands with that
> one-line tweak.

Would have been too easy... bs->total_sectors isn't updated yet, so the
assertion does fail.

I can make the assertion check end_offset >= ... instead. That should
still check what we wanted to check here and allow the unaligned
extension.

This feels like the better option to me compared to updating
bs->total_sectors earlier and then undoing that change in every error
path.

Kevin




Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate

2020-04-23 Thread Kevin Wolf
Am 23.04.2020 um 12:53 hat Max Reitz geschrieben:
> On 22.04.20 17:21, Kevin Wolf wrote:
> > If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling
> > qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't
> > undo any previous preallocation, but just adds the zero flag to all
> > relevant L2 entries. If an external data file is in use, a write_zeroes
> > request to the data file is made instead.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  block/qcow2.c | 30 ++
> >  1 file changed, 30 insertions(+)
> > 
> > diff --git a/block/qcow2.c b/block/qcow2.c
> > index 9cfbdfc939..bd632405d1 100644
> > --- a/block/qcow2.c
> > +++ b/block/qcow2.c
> 
> [...]
> 
> > @@ -4214,6 +4215,35 @@ static int coroutine_fn 
> > qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
> >  g_assert_not_reached();
> >  }
> >  
> > +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) {
> > +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);
> > +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size);
> > +
> > +/* Use zero clusters as much as we can */
> > +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, 
> > 0);
> 
> It’s kind of a pity that this changes the cluster mappings that were
> established when using falloc/full preallocation already (i.e., they
> become preallocated zero clusters then, so when writing to them, we need
> COW again).
> 
> But falloc/full preallocation do not guarantee that the new data is
> zero, so I suppose this is the only thing we can reasonably do.

If we really want, I guess we could make full preallocation first try
passing BDRV_REQ_ZERO_WRITE to the protocol layer and if that succeeds,
we could skip setting the zero cluster flag at the qcow2 level.

Feels like a separate patch, though.

Kevin


signature.asc
Description: PGP signature


Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate

2020-04-23 Thread Max Reitz
On 23.04.20 15:25, Kevin Wolf wrote:
> Am 23.04.2020 um 12:53 hat Max Reitz geschrieben:
>> On 22.04.20 17:21, Kevin Wolf wrote:
>>> If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling
>>> qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't
>>> undo any previous preallocation, but just adds the zero flag to all
>>> relevant L2 entries. If an external data file is in use, a write_zeroes
>>> request to the data file is made instead.
>>>
>>> Signed-off-by: Kevin Wolf 
>>> ---
>>>  block/qcow2.c | 30 ++
>>>  1 file changed, 30 insertions(+)
>>>
>>> diff --git a/block/qcow2.c b/block/qcow2.c
>>> index 9cfbdfc939..bd632405d1 100644
>>> --- a/block/qcow2.c
>>> +++ b/block/qcow2.c
>>
>> [...]
>>
>>> @@ -4214,6 +4215,35 @@ static int coroutine_fn 
>>> qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
>>>  g_assert_not_reached();
>>>  }
>>>  
>>> +if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) {
>>> +uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);
>>> +uint64_t zero_end = QEMU_ALIGN_UP(offset, s->cluster_size);
>>> +
>>> +/* Use zero clusters as much as we can */
>>> +ret = qcow2_cluster_zeroize(bs, zero_start, zero_end - zero_start, 
>>> 0);
>>
>> It’s kind of a pity that this changes the cluster mappings that were
>> established when using falloc/full preallocation already (i.e., they
>> become preallocated zero clusters then, so when writing to them, we need
>> COW again).
>>
>> But falloc/full preallocation do not guarantee that the new data is
>> zero, so I suppose this is the only thing we can reasonably do.
> 
> If we really want, I guess we could make full preallocation first try
> passing BDRV_REQ_ZERO_WRITE to the protocol layer and if that succeeds,
> we could skip setting the zero cluster flag at the qcow2 level.

That might be nice.

> Feels like a separate patch, though.

Definitely, yes.

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v5 4/9] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate

2020-04-23 Thread Eric Blake

On 4/23/20 8:23 AM, Kevin Wolf wrote:



So qcow2_cluster_zeroize() seems to accept the unaligned tail. It would
still set the zero flag for the partial last cluster and for the
external data file, bdrv_co_pwrite_zeroes() would have the correct size.


Then I'm in favor of NOT rounding the tail.  That's an easy enough change
and we've now justified that it does what we want, so R-b stands with that
one-line tweak.


Would have been too easy... bs->total_sectors isn't updated yet, so the
assertion does fail.

I can make the assertion check end_offset >= ... instead. That should
still check what we wanted to check here and allow the unaligned
extension.


Yes, that works for me.



This feels like the better option to me compared to updating
bs->total_sectors earlier and then undoing that change in every error
path.


Indeed.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org