[Qemu-devel] [Bug 1250360] [NEW] qcow2 image logical corruption after host crash

Blue Tue, 12 Nov 2013 11:14:48 -0800

Public bug reported:

Description of problem:
After a power failure, disk images in qcow2 format that were in use at the time 
can become logically corrupt: they appear unused (full of zeroes).
The data still seems to be in the file, but so far I cannot find any reliable 
method to recover it. With a raw image a recovery path would be available, but 
a qcow2 image presents only zeroes once it is corrupted. My understanding is 
that the block map of the image gets reset, so the image is then assumed to be 
unused.
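
One way to test the "block map gets reset" theory is to read the qcow2 header and count how many L1 table entries are non-zero. The following is a minimal sketch, assuming the fixed 72-byte big-endian header layout from the qcow2 format specification (shared by versions 2 and 3). The function name summarize_qcow2 and the returned keys are made up for illustration and are not part of any QEMU tool.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
# Fixed header fields common to qcow2 v2 and v3 (big-endian, 72 bytes):
# magic, version, backing_file_offset, backing_file_size, cluster_bits,
# size, crypt_method, l1_size, l1_table_offset, refcount_table_offset,
# refcount_table_clusters, nb_snapshots, snapshots_offset
HEADER_FMT = ">4sIQIIQIIQQIIQ"


def summarize_qcow2(path):
    """Return basic header fields and the number of non-zero L1 entries."""
    header_len = struct.calcsize(HEADER_FMT)
    with open(path, "rb") as f:
        raw = f.read(header_len)
        if len(raw) < header_len:
            raise ValueError("file too short for a qcow2 header")
        (magic, version, _bf_off, _bf_size, cluster_bits, size,
         _crypt, l1_size, l1_offset, _rc_off, _rc_clusters,
         _nb_snap, _snap_off) = struct.unpack(HEADER_FMT, raw)
        if magic != QCOW2_MAGIC:
            raise ValueError("not a qcow2 image")
        # Each L1 entry is 8 bytes; a zero entry means no L2 table is mapped.
        f.seek(l1_offset)
        l1_raw = f.read(l1_size * 8)
    count = len(l1_raw) // 8
    entries = struct.unpack(">%dQ" % count, l1_raw[:count * 8])
    return {
        "version": version,
        "cluster_size": 1 << cluster_bits,
        "virtual_size": size,
        "l1_entries": l1_size,
        "l1_entries_in_use": sum(1 for e in entries if e != 0),
    }
```

If l1_entries_in_use comes back as 0 for an image that is known to have held data, that would be consistent with the mapping having been lost while the data clusters are still in the file. It does not by itself identify the cause.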
My detailed setup:


Kernel 2.6.32-358.18.1.el6.x86_64
qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
Used via libvirt libvirt-0.10.2-18.el6_4.14.x86_64
The image was used from an NFS share (the NFS server did NOT crash and remained 
permanently active).

qemu-img check finds no corruption.
qemu-img convert converts the image to a raw image that is full of zeroes. 
However, there is data in the qcow2 file, and the storage backend was neither 
restarted nor deactivated during the incident.
I encountered this issue on two different machines; in both cases I was not 
able to recover the data.
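
The claim that data is still present in the file can be checked on the host without going through the qcow2 mapping at all. The following is a rough sketch that scans the image file in cluster-sized chunks and counts those that are not entirely zero. The function name count_nonzero_chunks is hypothetical, and the 2 MiB default simply matches the cluster size used when the image was created.

```python
def count_nonzero_chunks(path, chunk_size=2 * 1024 * 1024):
    """Return (nonzero_chunks, total_chunks) for the file at path."""
    zero = bytes(chunk_size)
    total = 0
    nonzero = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += 1
            # The last chunk may be shorter than chunk_size.
            if chunk != zero[:len(chunk)]:
                nonzero += 1
    return nonzero, total
```

The first chunk always counts as non-zero because it holds the qcow2 header, and metadata clusters (L1/L2 and refcount tables) count as well, so only a large number of non-zero chunks suggests that guest data is still there.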
The image was qcow2, thin provisioned, created like this:
 qemu-img create -f qcow2 -o cluster_size=2M imagename.img
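
The cluster size matters for how much of the disk depends on a single piece of metadata. In the qcow2 format an L2 table occupies one cluster and holds 8-byte entries, each mapping one data cluster. A short worked calculation (the helper name l2_table_coverage is only illustrative):

```python
def l2_table_coverage(cluster_size):
    """Bytes of guest disk mapped by one L2 table for a given cluster size."""
    entries_per_l2 = cluster_size // 8  # one L2 table = one cluster of 8-byte entries
    return entries_per_l2 * cluster_size


coverage = l2_table_coverage(2 * 1024 * 1024)
print(coverage // 2**30, "GiB per L2 table")  # prints: 512 GiB per L2 table
```

With cluster_size=2M, a single L2 table (and the single L1 entry pointing to it) covers 512 GiB of guest disk, so for images up to that size the loss of one L1 entry or one L2 table would be enough to make the whole image read as zeroes. Whether that is what happened here is an assumption, not something established in this report.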

While addressing the root cause so that this issue does not repeat
would be the ideal scenario, a temporary workaround that can be run on
an affected qcow2 image to "patch" it and recover the data (possibly
after a full fsck/recovery inside the guest) would also be good.
Otherwise we are basically losing data on a large scale when using
qcow2.


Version-Release number of selected component (if applicable):
Kernel 2.6.32-358.18.1.el6.x86_64
qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
Used via libvirt libvirt-0.10.2-18.el6_4.14.x86_64

How reproducible:
I am not able to reproduce it manually (and at the moment don't have enough 
resources to try), but the probability of the issue seems quite high, as this 
is the second case of such corruption in weeks.
Additional info:
I can privately provide an image displaying the corruption.

The reported problem actually has two aspects: the first is the cause that 
eventually produces this issue.
The second is that once the logical corruption has occurred, qemu-img check 
finds nothing wrong with the image, which is obviously wrong.

** Affects: qemu
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1250360

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1250360/+subscriptions
