On Mon, Jun 22, 2020 at 6:07 PM Max Reitz <mre...@redhat.com> wrote: > > On 22.06.20 16:46, Alberto Garcia wrote: > > On Mon 22 Jun 2020 11:35:59 AM CEST, Max Reitz wrote: > >>>> + if (qcow2_opts->data_file_raw && > >>>> + qcow2_opts->preallocation == PREALLOC_MODE_OFF) > >>>> + { > >>>> + /* > >>>> + * data-file-raw means that "the external data file can be > >>>> + * read as a consistent standalone raw image without looking > >>>> + * at the qcow2 metadata." It does not say that the metadata > >>>> + * must be ignored, though (and the qcow2 driver in fact does > >>>> + * not ignore it), so the L1/L2 tables must be present and > >>>> + * give a 1:1 mapping, so you get the same result regardless > >>>> + * of whether you look at the metadata or whether you ignore > >>>> + * it. > >>>> + */ > >>>> + qcow2_opts->preallocation = PREALLOC_MODE_METADATA; > >>> > >>> I'm not convinced by this, > >> > >> Why not? > >> > >> This is how I read the spec. Furthermore, I see two problems that we > >> have right now that are fixed by this patch (namely (1) using a device > >> file as the external data file, which may have non-zero data at > >> creation; and (2) assigning a backing file at runtime must not show > >> the data). > > > > What happens if you first create the image (which would be preallocated > > with this patch), then you resize it and finally you assign the backing > > file? Would the resized part be preallocated? > > Good point, when resizing an image with data-file-raw we also need to > preallocate the L2 tables. > > >>> but your comment made me think of another possible alternative: in > >>> qcow2_get_cluster_offset(), if the cluster is unallocated and we are > >>> using a raw data file then we return _ZERO_PLAIN: > >>> > >>> --- a/block/qcow2-cluster.c > >>> +++ b/block/qcow2-cluster.c > >>> @@ -654,6 +654,10 @@ out: > >>> assert(bytes_available - offset_in_cluster <= UINT_MAX); > >>> *bytes = bytes_available - offset_in_cluster; > >>> > >>> + if (type == QCOW2_CLUSTER_UNALLOCATED && data_file_is_raw(bs)) { > >>> + type = QCOW2_CLUSTER_ZERO_PLAIN; > >>> + } > >>> + > >>> return type; > >>> > >>> You could even add a '&& bs->backing' to the condition and emit a > >>> warning to make it more explicit. > >> > >> No, this is wrong. This still wouldn’t fix the problem of having a > >> device file as the external data file, when it already has non-zero > >> data during creation. (Reading the qcow2 file would return zeroes, > >> but reading the device would not.) > > > > But you wouldn't fix that preallocating the metadata either, you would > > need to fill the device with zeroes. > > What it fixes is that reading the qcow2 image and the raw device returns > the same data. > > Initially, I also thought that we should initialize raw data files to be > zero during creation, but Eric changed my mind: > > https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg00223.html > > >> I interpret the spec in that the metadata can be ignored, but it does > >> not need to be ignored. So the L1/L2 tables must be 1:1 mapping of > >> QCOW2_CLUSTER_NORMAL entries. > >> > >> We could also choose to interpret it as “With data-file-raw, the L1/L2 > >> tables must be ignored”. In that case, our qcow2 driver would need to > >> be modified to indeed fully ignore the L1/L2 tables with > >> data-file-raw. (I certainly don’t interpret the spec this way, but I > >> suppose we could call it a bug fix and amend it.) > > > > The way I interpret it is that regardless of whether you read the data > > through the qcow2 file or directly from the data file you should get the > > same results, but how that should be reflected in the L1/L2 metadata is > > not specified. > > That’s an absolute given, but the question is what does “reading through > the qcow2 file” mean. Respecting the metadata? Ignoring it? Something > in between? > > As I noted in my reply to myself, data-file-raw is an autoclear flag. > That means, an old version of qemu that doesn’t recognize the flag must > read the same data as a new version. It follows that the the L2 tables > must be a 1:1 mapping. (Or the flag can’t be an autoclear flag.)
Being able to read sounds like a nice to have feature, but what about writing? I hope that the image is not writable by older versions that do not understand data_file. Otherwise older qemu versions can corrupt the image silently.