On 2/19/20 10:21 AM, Eric Blake wrote:
It took me a few minutes of thinking about this.
Scenario 1:
base.raw <- wrap.qcow2
where wrap.qcow2 specifies backing of base.raw but not the format. If
we probe, we can have a couple of outcomes:
1a: base.raw probes as raw (the probed image has no backing or data
file), using it as raw is safe (it matches what wrap.qcow2 should have
specified but didn't, and we aren't changing the data the guest would
read nor are we opening unexpected files)
1b: base.raw probes as qcow2 (because of whatever the guest wrote
there), using it as qcow2 is wrong - the guest will see corrupted data.
What's more, if the probe sees it as qcow2 with backing file, and we
open the backing file, it also has security implications.
Scenario 2:
base.qcow2 <- wrap.qcow2
where wrap.qcow2 specifies backing of base.qcow2 but not the format. If
we probe, we will always have just one outcome:
2a: base.qcow2 probes as qcow2. Using it as qcow2 is correct, but if
base.qcow2 has a further backing image, the backing chain is now
dependent on a probe.
Since 1b and 2a have the same probe result, but massively different data
corruption and/or security concerns, it is NOT sufficient to claim that
a probe was safe merely because "the probed image does not have any
backing or data file". It is ONLY safe if the probe turns up raw. Any
other probed format is inherently unsafe.
Having said that, I can offer this heuristic:
If the file name itself implies a particular format was intended (such
as naming a file *.raw or *.qcow2), and the probe matches what the
suffix would imply, we can probably guess that the user did really mean
that format (we could even allow *.qcow to resolve to a probe of qcow2,
even though qcow and qcow2 are different probe results). File names
that do not imply a format (such as 'bare' or 'base.img' or ...) cannot
benefit from heuristics.
But we just got rid of suffix heuristics in patch 1 of this series. So
using file suffix hueristics to accept a filename of 'base.qcow2' that
probed as qcow2 as probably being safe, in spite of the missing format,
is risky business. While a heuristic might let more pre-existing cases
of missing formats boot without data corruption, we have to consider
whether the extra magic is helpful or will get in the way for the cases
where a management app misnamed their file (calling a file 'base.qcow2'
when it is really raw) and such misnaming causes our heuristics to allow
data corruption through to the guest.
Extra magic that can't go wrong is one thing, but extra magic that can
risk data corruption in corner cases where a management app was not
careful with their file names is probably not worth the maintenance
effort. So I agree with patch 1 removing suffix probing, and think it
is not worth trying the file name heuristics just to help management
apps that forgot their backing format.
I disagree with the logic here. What we really need is:
If the backing format was not specified, we probe to see what is there.
If the result of that probe is raw, it is safe to treat the image as
raw. If the result is anything else, we must report an error stating
that what we probed could not be trusted unless the user adds an
explicit backing format (either confirming that our probe was correct,
or with the correct format overriding what we mis-probed).
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org