On Mon 26 Feb 2018 02:40:08 PM CET, Max Reitz wrote: >> + "L1 table is too large; snapshot table entry >> corrupted\n", >> + (fix & BDRV_FIX_ERRORS) ? "Deleting" : "ERROR", >> + sn->id_str, sn->name, sn->l1_size); >> + found_corruption = true; >> + } > > This code assumes the snapshot table itself has been valid. Why should > it be when it contains garbage entries? > >> + >> + if (found_corruption) { >> + result->corruptions++; >> + sn->l1_size = 0; /* Prevent this L1 table from being used */ >> + if (fix & BDRV_FIX_ERRORS) { >> + ret = qcow2_snapshot_delete(bs, sn->id_str, sn->name, >> NULL); > > So calling this is actually very dangerous. It modifies the snapshot > table which I wouldn't trust is actually just a snapshot table. It > could intersect any other structure in the qcow2 image. Yes, we do an > overlap check, but that only protects metadata, and I don't really > want to see an overlap check corruption when repairing the image; > especially since this means you cannot fix the corruption.
I think you're right. One thing that could be done is check that the area where the snapshot table is supposed to be doesn't overlap with any other data structures. > I don't quite know myself what to do instead, but I guess my main > point would be: Before any (potentially) destructive changes are made, > the user should have the chance of still opening the image read-only > and copying all the data off somewhere else. Which of course again > means we shouldn't prevent the user from opening an image because a > snapshot is broken. Ok, you've convinced me. I'll see what I can come up with. I guess initially we can simply add checks for sn->l1_table_offset and sn->l1_size in the places where they're used. Berto