Hi all,

We've been running into an issue with qcow2 disk corruption and are hoping to get pointed in the right direction. We're currently using the latest qemu from Trusty.

The issue started after powering a VM off and on again. On first boot, the guest (CentOS 6) started reporting I/O issues almost immediately and then crashed. Following that, the VM was unable to read the disk (it kept looping through the BIOS boot process).

The disk has a single snapshot, which we were able to get working by following this process:

- Attempt to apply the snapshot (this appears to fail).
- Run qemu-img check + repair.
- Use qemu-img convert to convert qcow2 to qcow2.

Once complete, we were able to boot from the disk; however, it was at the point that the snapshot was taken.

We have attempted to do a check + repair and then convert without applying the snapshot, but are running into the following errors:

- qemu-img check + repair:

  Warning: cluster offset=0x2d3120706a0000 is after the end of the image file, can't properly check refcounts.
  ERROR offset=2d312070696e00: Cluster is not properly aligned; L2 entry corrupted.
  Warning: cluster offset=0x2d310a43500000 is after the end of the image file, can't properly check refcounts.
  Warning: cluster offset=0x2d310a43510000 is after the end of the image file, can't properly check refcounts.
  ERROR offset=2d310a43505500: Cluster is not properly aligned; L2 entry corrupted.
  Warning: cluster offset=0x20496e74650000 is after the end of the image file, can't properly check refcounts.
  Warning: cluster offset=0x20496e74660000 is after the end of the image file, can't properly check refcounts.
  ERROR offset=20496e74656c00: Cluster is not properly aligned; L2 entry corrupted.
  Warning: cluster offset=0x2f6d6d6f6e0000 is after the end of the image file, can't properly check refcounts.
  Warning: cluster offset=0xd2070726f0000 is after the end of the image file, can't properly check refcounts.
  Warning: cluster offset=0xd207072700000 is after the end of the image file, can't properly check refcounts.
  Warning: cluster offset=0x336f7220730000 is after the end of the image file, can't properly check refcounts.

- qemu-img convert:

  qemu-img: error while reading block status of sector 147456: Input/output error

Here's the qemu-img info output for that disk:

  image: disk.pre-convert
  file format: qcow2
  virtual size: 180G (193273528320 bytes)
  disk size: 153G
  cluster_size: 65536
  backing file: /var/lib/nova/instances/_base/xxx
  Snapshot list:
  ID   TAG   VM SIZE   DATE                  VM CLOCK
  67   xxx   0         2016-04-14 05:22:34   00:00:00.000

Note that the virtual size has been increased from 80G. It previously looked like this:

  image: disk.pre-convert
  file format: qcow2
  virtual size: 80G (85899345920 bytes)
  disk size: 153G
  cluster_size: 65536
  backing file: /var/lib/nova/instances/_base/c45e2e81d34824861271a098bccd5585128e2c05
  Snapshot list:
  ID   TAG                                VM SIZE   DATE                  VM CLOCK
  67   e50825fbd43e455283ef847b12eaea4c   0         2016-04-14 05:22:34   00:00:00.000

We've tried using qcow2.py from the qemu source tree to clear the snapshot headers; however, it didn't help.
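
In case it helps with diagnosis, here is a small Python sketch for sanity-checking the offsets from the "ERROR offset=..." lines of the check output. It tests each offset against the 65536-byte cluster size reported for the image, and also prints the offset's bytes as ASCII. The helper names (is_cluster_aligned, offset_as_text) are made up for illustration and are not part of qemu or qcow2.py. The three offsets tried here happen to decode to readable text, which might mean the L2 tables were overwritten with file data, but that is only a guess on our part.

```python
CLUSTER_SIZE = 65536  # cluster_size reported for the image

# Offsets copied from the "ERROR offset=..." lines of the check output.
ERROR_OFFSETS = [0x2d312070696e00, 0x2d310a43505500, 0x20496e74656c00]


def is_cluster_aligned(offset, cluster_size=CLUSTER_SIZE):
    """Return True if the offset is a whole multiple of the cluster size."""
    return offset % cluster_size == 0


def offset_as_text(offset):
    """Interpret the 64-bit offset as big-endian bytes and decode as ASCII.

    Leading and trailing NUL bytes are stripped; non-ASCII bytes are
    replaced rather than raising an error.
    """
    raw = offset.to_bytes(8, "big").strip(b"\x00")
    return raw.decode("ascii", errors="replace")


if __name__ == "__main__":
    for off in ERROR_OFFSETS:
        print(hex(off), "aligned:", is_cluster_aligned(off),
              "as text:", repr(offset_as_text(off)))
```

This only inspects the numbers printed by qemu-img check; it does not read or modify the image itself.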