On 24.04.20 20:47, Eric Blake wrote:
> On 4/24/20 1:37 PM, Alberto Garcia wrote:
>> On Fri 24 Apr 2020 08:25:45 PM CEST, Vladimir Sementsov-Ogievskiy
>> <vsement...@virtuozzo.com> wrote:
>>>>> Reading the entire cluster will be interesting - you'll have to
>>>>> decompress the entire memory, then overwrite the zeroed portions.
>>>>
>>>> I don't think so, qcow2_get_host_offset() would detect the number of
>>>> contiguous subclusters of the same type at the given offset. In this
>>>> case they would be _ZERO subclusters so there's no need to decompress
>>>> anything, or even read it (it works the same with uncompressed
>>>> clusters).
>>>
>>> But if at least one of subclusters to read is not _ZERO, you'll have
>>> to decompress the whole cluster, and after decompression rewrite
>>> zero-subclusters by zeroes, as Eric says.. Or I lost the thread:)
>>
>> I don't see why you would need to rewrite anything... you do have to
>> decompress the whole cluster, and the uncompressed cluster in memory
>> would have stale data, but you never need to use that data for anything,
>> let alone to return it to the guest.
>>
>> Even if there's a COW, the new cluster would inherit the compressed
>> cluster's bitmap so the zeroized subclusters still read as zeroes.
>>
>> It's the same with normal clusters, 'write -P 0xff 0 64k' followed by
>> 'write -z 16k 16k'. The host cluster on disk still reads as 0xff but the
>> L2 entry indicates that part of it is just zeroes.
> 
> The point is this:  Consider 'write -P 0xff 0 64k', then 'write -z 16k
> 16k', then 'read 0 64k'.  For normal clusters, we can just do a
> scatter-gather iov read of read 0-16k and 32-64k, plus a memset of
> 16-32k.  But for compressed clusters, we have to read and decompress the
> entire 64k, AND also memset 16k-32k.  But if zeroing after reading is
> not that expensive, then the same technique for normal clusters is fine
> (instead of a scatter-gather read of 48k, just read the whole 64k
> cluster before doing the memset).

It would also mean letting qcow2_co_preadv_part() special-handle such
cases, i.e., whenever the whole clusters is compressed, it needs to read
it as a whole, regardless of the subcluster status, and then memset()
all areas to zero that are all-zero subclusters.  Otherwise we’d read
and decompress the whole buffer twice (once for 0 to 16k, once for 32k
to 64k).

This may be complicated a bit by the task schema, i.e. that reads are
scheduled in the task pool.  For qcow2_co_preadv_part() to memset some
area after decompression, it would need to wait on the read_compressed
task, which would make the whole task pool thing moot (for compressed
clusters).  Or it just does the memset() at the end, when we have to
settle the task pool anyway, but then it would have to remember all
areas it still needs to zero.

Hm, or, qcow2_co_preadv_compresed() could figure out where the zeroed
subclusters are and then memset() them itself, e.g. by receiving the
subcluster bitmap.  Probably the simplest implementation, but it seems a
bit like a layering breach.

Not sure how bad the complexity is on the write side for not letting
zero writes just zero the subcluster, but it doesn’t seem to me that the
opposite would come for free on the read side.

Max

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to