On 31.07.19 11:22, Max Reitz wrote:
> On 30.07.19 21:08, Eric Blake wrote:
>> On 7/30/19 12:25 PM, Max Reitz wrote:
>>> We currently refuse to open qcow2 images with overly long snapshot
>>> tables.  This patch makes qemu-img check -r all drop all offending
>>> entries past what we deem acceptable.
>>>
>>> Signed-off-by: Max Reitz <mre...@redhat.com>
>>> ---
>>>  block/qcow2-snapshot.c | 89 +++++++++++++++++++++++++++++++++++++-----
>>>  1 file changed, 79 insertions(+), 10 deletions(-)
>>
>> I'm less sure about this one.  8/13 should have no semantic effect (if
>> the user _depended_ on that much extra data, they should have set an
>> incompatible feature flag bit, at which point we'd leave their data
>> alone because we don't recognize the feature bit; so it is safe to
>> assume the user did not depend on the data and that we can thus nuke it
>> with impunity).  But here, we are throwing away the user's internal
>> snapshots, and not even giving them a say in which ones to throw away
>> (more likely, by trimming from the end, we are destroying the most
>> recent snapshots in favor of the older ones - but I could argue that
>> throwing away the oldest also has its uses).
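
For reference, the trimming boils down to clipping the snapshot count
at the first entry that pushes the table past the limit, before the
table is rewritten.  A rough sketch, not the actual patch, with
QCOW_MAX_SNAPSHOTS_SIZE standing in for whatever limit we pick:

    /* Sketch only: assume we are inside the loop that parses snapshot
     * table entries, with i the index of the entry just parsed and
     * offset the byte position just past it. */
    if (offset - s->snapshots_offset > QCOW_MAX_SNAPSHOTS_SIZE) {
        if (!(fix & BDRV_FIX_ERRORS)) {
            result->check_errors++;
            return -EFBIG;
        }
        /* Drop this entry and everything after it */
        result->corruptions += s->nb_snapshots - i;
        s->nb_snapshots = i;
        break;  /* the shortened table is written back during repair */
    }
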
> 
> First, I don’t think there really is a legitimate use case for an
> overly long snapshot table.  In fact, I think our limit is too high as
> it is; we only introduced it because we didn’t have any repair
> functionality and thus had to pick some limit that nobody could ever
> reasonably reach.
> 
> (As the test shows, you need more than 500 snapshots, each with a
> 64 kB name, a 64 kB ID string, and 1 kB of extra data, to reach this
> limit.)
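
(Back-of-the-envelope, assuming the limit ends up being 64 MiB: each
such entry takes roughly 40 bytes of fixed header + 1 kB of extra data
+ 64 kB of ID string + 64 kB of name ≈ 129 kB, and 64 MiB / 129 kB
≈ 508 entries, hence the “more than 500”.)
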
> 
> So the only likely way to reach this number of snapshots is
> corruption.  OK, so maybe we don’t need to be able to fix it, then,
> because the image is corrupted anyway.
> 
> But I think we do want to be able to fix it, because otherwise you
> can’t open the image at all, and thus cannot even read the active
> layer.
> 
> 
> This gets me to my second point: it doesn’t make things worse.  Right
> now, we just refuse to open such images in all cases.  I’d personally
> prefer discarding some data on my image over losing all of it.
> 
> 
> And third, I wonder what interface you have in mind.  I think adding an
> interface to qemu-img check to properly address this problem (letting
> the user discard individual snapshots) is hard.  I could imagine two things:
> 
> (A) Making qemu-img snapshot sometimes set BDRV_O_CHECK, too, or
> something.  For qemu-img snapshot -d, you don’t need to read the whole
> table into memory, and thus we don’t need to impose any limit.  But that
> seems pretty hackish to me.
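
What I had in mind for (A), heavily simplified: walk the on-disk table
entry by entry and copy everything except the victim to a new table, so
there is never more than one entry in memory.  Note that
read_one_snapshot_entry() and write_one_snapshot_entry() are made-up
helpers here; nothing like them exists:

    /* Hypothetical streaming deletion; one entry in memory at a time */
    for (i = 0; i < s->nb_snapshots; i++) {
        ret = read_one_snapshot_entry(bs, &offset, &sn);      /* made up */
        if (ret < 0) {
            return ret;
        }
        if (!strcmp(sn.id_str, victim_id)) {
            continue;  /* this is the snapshot being deleted; skip it */
        }
        ret = write_one_snapshot_entry(bs, &new_offset, &sn); /* made up */
        if (ret < 0) {
            return ret;
        }
    }
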
> 
> (B) Maybe the proper solution would be to add an interactive interface
> to bdrv_check().  I can imagine that in the future, we may get more
> cases where we want interaction with the user on what data to delete and
> so on.  But that’s hard...  (I’ll try.  Good thing stdio is already the
> standard interface in bdrv_check(), so I won’t have to feel bad if I go
> down that route even further.)
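
To give an idea of what that interaction could look like (purely
hypothetical; nothing like this exists in bdrv_check() today):

    /* Hypothetical stdio prompt for an interactive repair mode */
    static bool check_ask_discard(const char *name)
    {
        char answer[8];

        printf("Snapshot table too big; discard snapshot '%s'? [y/N] ",
               name);
        fflush(stdout);
        if (!fgets(answer, sizeof(answer), stdin)) {
            return false;  /* EOF: keep the snapshot */
        }
        return answer[0] == 'y' || answer[0] == 'Y';
    }
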

After some fiddling around, I don’t think this is worth it.  As I said,
this is an extremely rare case anyway, so the main goal should simply
be to make the active layer accessible again so that at least that data
can be copied off the image.

The other problem is that an interactive interface would introduce
quite complex code that basically cannot be tested reasonably.  I’d
rather not do that.

Max
