Am 19.01.2015 um 22:09 hat Max Reitz geschrieben: > On 2015-01-19 at 16:04, Eric Blake wrote: > >On 01/19/2015 01:49 PM, Max Reitz wrote: > >>With the series adding unalignment checks and the series reworking the > >>zero cluster expansion code overlapping, the unalignment checks have not > >>been implemented in the latter code. > >> > >>This series fixes it. > >> > >>There are other places which would require unalignment checks, like the > >>offsets of L1 tables, especially for snapshots; but because it would be > >>best to add these checks in the function which reads the snapshot table, > >>this would make images with broken snapshots completely unusable, which > >>is something I opted to avoid for now.
That's something I noticed, too: For qemu-img check, our qcow2_open() checks may be to strict. With a corrupted snapshot table, it won't even run. Perhaps we should be laxer with some checks with BDRV_O_CHECK, and for example just set s->nb_snapshots = 0 if loading the table fails. > >>Ideally, we need to make the qcow2 repair function repair such cases, > >>but until that is done there is not much we can do about them. > >What's the best repair? > > That's what I was asking myself... > > >Read the data from the unaligned location, and > >write a fresh copy into a new aligned allocation? > > Maybe. Maybe there is no way of repairing them and the only way is > asking the user whether it's fine to delete the snapshot (qemu-img > check -r make_consistent_whatever_it_takes). > > Also, the question remains for every unaligned data structure: L2 > tables, data clusters… The refcount structures are the only ones > that can be easily recovered. For data clusters, reading from the > unaligned location would probably be best; for L2 tables… maybe the > same, then run a consistency check on the data and if it seems off™, > throw the whole L2 table away. Reading from the unaligned location doesn't help. I have never seen an image whose L2 entries contained valid entries, except for the least significant few bits. If your table is corrupted, it's usually garbage from start to end. I think the only sane option is to drop the entries. The big question is what should be used to replace them. Kevin