On 22.06.20 00:25, Nir Soffer wrote:
> On Fri, Jun 19, 2020 at 1:40 PM Max Reitz <mre...@redhat.com> wrote:
>>
>> Hi,
>>
>> As discussed here:
>>
>> https://lists.nongnu.org/archive/html/qemu-block/2020-02/msg00644.html
>> https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg00329.html
>> https://lists.nongnu.org/archive/html/qemu-block/2020-06/msg00240.html
>>
>> I think that qcow2 images with data-file-raw should always have
>> preallocated 1:1 L1/L2 tables, so that the image always looks the same
>> whether you respect or ignore the qcow2 metadata.
>
> I don't know the internals of qcow2 data_file, but are we really using
> qcow2 metadata when accessing the data file?

Yes.

> This may have unwanted performance consequences.

I don’t think so, because in practice normal lookups of L1/L2 mappings
generally don’t cost that much performance.

> If I understand correctly, qcow2 metadata is needed only for keeping
> bitmaps (or maybe future extensions) for raw data file, and reading
> from the qcow2 image should be read directly from the raw file
> without any extra work.
>
> Writing to the data file should also bypass the qcow2 metadata, since
> the bitmap is updated in memory.

Well, with this series, writing would at least no longer update the
metadata, because it would always be preallocated already.

>> The easiest way to
>> achieve that is to enforce at least metadata preallocation whenever
>> data-file-raw is given.
>
> But preallocation is not free, even on file systems, it can be even
> slow (NFS < 4.2).

Metadata preallocation with an external data file should be the same
speed on every file system.  We only need to create the metadata
structures, which, with the default cluster size (64k), take up a bit
more than 1/8192 of the full image size.

Sure, it’s not free.  But if we decide we should indeed fully ignore the
L1/L2 tables for data-file-raw images, the qcow2 spec must be amended.
As far as I can read it, it currently doesn’t say so.

(By the way, this is not a trivial change.  Right now, data-file-raw is
an autoclear flag: If a version of qemu that doesn’t support it accesses
the image, it will automatically clear the flag, but the image stays
valid.  If we decide to completely ignore the L1/L2 tables (i.e. not
even create them), then this can no longer be an autoclear flag.  We’d
need a new incompatible flag, because without L1/L2 tables, the image
becomes useless to older qemu versions.)

> With block storage this means you need to allocate the entire image size on
> storage for writing the metadata.
>
> While oVirt does not use qcow2 with data_file, having preallocated qcow2
> will make this very hard to use, for example for 500 GiB disk we will have to
> allocate 500 GiB disk for the raw data file and 500 GiB disk for the qcow2
> metadata disk which will be 99% unused.

I don’t understand this.  When you use an external data file, the qcow2
file will only contain the metadata:

$ qemu-img create -f qcow2 \
    -o data_file=foo.data,data_file_raw=on,preallocation=metadata \
    foo.qcow2 8G
Formatting 'foo.qcow2', fmt=qcow2 size=8589934592 data_file=foo.data
data_file_raw=on cluster_size=65536 preallocation=metadata
lazy_refcounts=off refcount_bits=16

$ ls -l foo.qcow2
... 1310720 ... foo.qcow2

$ ls -l foo.data
... 8589934592 ... foo.data

> I don't think that kubevirt is planning to use this either, but if
> they decide to use this it may be a problem for them as well when
> using block storage.
>
> It looks like we abuse preallocation for getting the side effect that
> the backing file will be rejected, instead of adding the validation
> rejecting backing file in this case.

That isn’t the case.  I want to use preallocation because I interpret
the spec such that it requires metadata preallocation.  It says when
accessing a qcow2 file with data-file-raw, you can ignore the L1/L2
tables.  To me, that means that the L1/L2 tables must give a 1:1 mapping
so that you get the same result whether you interpret them or not.

Max
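
For reference, a rough sketch of the arithmetic behind the "a bit more
than 1/8192" figure and the 1310720-byte foo.qcow2 shown above.  It
assumes standard 8-byte L2 entries and counts only the L2 tables; the
function name is made up for illustration, and the breakdown of the
remaining clusters (header, L1 table, refcount table, one refcount
block) is an assumption, not something the qemu-img output confirms.

```python
def qcow2_l2_metadata_bytes(virtual_size, cluster_size=65536, l2_entry_size=8):
    """Bytes of L2 tables needed to map every cluster of the virtual disk.

    Assumes standard 8-byte L2 entries; each L2 table occupies one cluster.
    """
    entries_per_l2 = cluster_size // l2_entry_size          # 8192 with 64k clusters
    data_clusters = -(-virtual_size // cluster_size)        # ceiling division
    l2_tables = -(-data_clusters // entries_per_l2)         # ceiling division
    return l2_tables * cluster_size


size_8g = 8 * 1024**3
l2_bytes = qcow2_l2_metadata_bytes(size_8g)
print(l2_bytes)                  # 1048576 (16 L2 tables of 64k each)
print(size_8g // l2_bytes)       # 8192, i.e. L2 tables are 1/8192 of the image

# The observed file size from `ls -l foo.qcow2` is 1310720 bytes = 20 clusters.
# The 4 clusters beyond the L2 tables are presumably the header, the L1 table,
# the refcount table and one refcount block (assumption).
print(1310720 // 65536 - l2_bytes // 65536)   # 4

# Same estimate for the 500 GiB disk from the oVirt example:
print(qcow2_l2_metadata_bytes(500 * 1024**3))  # 65536000 (62.5 MiB)
```

So under these assumptions, a 500 GiB image with metadata preallocation
needs on the order of 63 MiB in the qcow2 file, not 500 GiB.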