On 2/19/20 10:21 AM, Eric Blake wrote:

It took me a few minutes of thinking about this.

Scenario 1:

base.raw <- wrap.qcow2

where wrap.qcow2 specifies backing of base.raw but not the format.  If we probe, we can have a couple of outcomes:

1a: base.raw probes as raw (the probed image has no backing or data file), using it as raw is safe (it matches what wrap.qcow2 should have specified but didn't, and we aren't changing the data the guest would read nor are we opening unexpected files)

1b: base.raw probes as qcow2 (because of whatever the guest wrote there), using it as qcow2 is wrong - the guest will see corrupted data. What's more, if the probe sees it as qcow2 with backing file, and we open the backing file, it also has security implications.

Scenario 2:

base.qcow2 <- wrap.qcow2

where wrap.qcow2 specifies backing of base.qcow2 but not the format.  If we probe, we will always have just one outcome:

2a: base.qcow2 probes as qcow2. Using it as qcow2 is correct, but if base.qcow2 has a further backing image, the backing chain is now dependent on a probe.

Since 1b and 2a have the same probe result, but massively different data corruption and/or security concerns, it is NOT sufficient to claim that a probe was safe merely because "the probed image does not have any backing or data file".  It is ONLY safe if the probe turns up raw.  Any other probed format is inherently unsafe.

Having said that, I can offer this heuristic:

If the file name itself implies a particular format was intended (such as naming a file *.raw or *.qcow2), and the probe matches what the suffix would imply, we can probably guess that the user did really mean that format (we could even allow *.qcow to resolve to a probe of qcow2, even though qcow and qcow2 are different probe results). File names that do not imply a format (such as 'bare' or 'base.img' or ...) cannot benefit from heuristics.

But we just got rid of suffix heuristics in patch 1 of this series. So using file suffix hueristics to accept a filename of 'base.qcow2' that probed as qcow2 as probably being safe, in spite of the missing format, is risky business. While a heuristic might let more pre-existing cases of missing formats boot without data corruption, we have to consider whether the extra magic is helpful or will get in the way for the cases where a management app misnamed their file (calling a file 'base.qcow2' when it is really raw) and such misnaming causes our heuristics to allow data corruption through to the guest.

Extra magic that can't go wrong is one thing, but extra magic that can risk data corruption in corner cases where a management app was not careful with their file names is probably not worth the maintenance effort. So I agree with patch 1 removing suffix probing, and think it is not worth trying the file name heuristics just to help management apps that forgot their backing format.



I disagree with the logic here.  What we really need is:

If the backing format was not specified, we probe to see what is there. If the result of that probe is raw, it is safe to treat the image as raw.  If the result is anything else, we must report an error stating that what we probed could not be trusted unless the user adds an explicit backing format (either confirming that our probe was correct, or with the correct format overriding what we mis-probed).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Reply via email to