Hello Kevin, hello Michael, hello *,

we noticed a data corruption bug in qemu-1.1.2, which will be shipped by
Debian and our own Debian-based distribution.

The corruption mostly manifests while installing large Debian package files
and seems to be related to memory pressure: as long as the file is still in
the page cache, everything looks fine, but when the file is re-read from the
virtual hard disk, which uses a qcow2 file backed by another qcow2 file, its
content is corrupted: dpkg complains that the .tar.gz file inside the Debian
archive file is corrupted, and the md5sum no longer matches.

I tracked this down using "git bisect" to your patch attached below, which
fixed this bug, so everything is fine with qemu-kvm-1.2.0.

From my reading, the patch seems to explain our problems: during my own
testing during development I never used backing chains, and the problem only
showed up when my colleagues started using qemu-kvm-1.1.2 with their VMs
using backing chains.

@Kevin: Do you think that's a valid explanation and that your patch should
fix that problem? I'd like to get your expertise before filing a bug with
Debian and asking Michael to include that patch in his next stable update
for 1.1.

Thanks in advance.

Sincerely
Philipp
-- 
Philipp Hahn           Open Source Software Engineer      h...@univention.de
Univention GmbH        be open.                       fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                   http://www.univention.de/

--- Begin Message ---
avail_sectors should really be the number of sectors from the start of
the allocation, not from the start of the write request.

We're lucky enough that this mistake didn't cause any real bug.
avail_sectors is only used in the initialiser of QCowL2Meta:

  .nb_available   = MIN(requested_sectors, avail_sectors),

m->nb_available in turn is only used for COW at the end of the
allocation. A COW occurs only if the request wasn't cluster aligned,
which in turn would imply that requested_sectors was less than
avail_sectors (both in the original and in the fixed version). In this
case avail_sectors is ignored and therefore the mistake doesn't cause
any misbehaviour.

Signed-off-by: Kevin Wolf <kw...@redhat.com>
---
 block/qcow2-cluster.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 98fba71..d7e0e19 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -947,8 +947,16 @@ again:
 
     /* save info needed for meta data update */
     if (nb_clusters > 0) {
+        /*
+         * requested_sectors: Number of sectors from the start of the first
+         * newly allocated cluster to the end of the (possibly shortened
+         * before) write request.
+         *
+         * avail_sectors: Number of sectors from the start of the first
+         * newly allocated to the end of the last newly allocated cluster.
+         */
         int requested_sectors = n_end - keep_clusters * s->cluster_sectors;
-        int avail_sectors = (keep_clusters + nb_clusters)
+        int avail_sectors = nb_clusters
                             << (s->cluster_bits - BDRV_SECTOR_BITS);
 
         *m = (QCowL2Meta) {
-- 
1.7.6.5
--- End Message ---
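
For readers following the arithmetic in the forwarded patch, here is a minimal
sketch of the two avail_sectors computations and of the MIN() that feeds
nb_available. It is not the QEMU code: the helper names are hypothetical, only
the expressions are taken from the diff, and the sample values mentioned
afterwards (64 KiB clusters, i.e. cluster_bits = 16, with 512-byte sectors)
are assumptions chosen for illustration.

```c
/* 512-byte sectors (assumed here to match QEMU's BDRV_SECTOR_BITS). */
#define BDRV_SECTOR_BITS 9

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper: the pre-patch expression, which also counts the
 * keep_clusters clusters that precede the new allocation. */
static int avail_sectors_old(int keep_clusters, int nb_clusters,
                             int cluster_bits)
{
    return (keep_clusters + nb_clusters) << (cluster_bits - BDRV_SECTOR_BITS);
}

/* Hypothetical helper: the patched expression, counting only the sectors
 * of the newly allocated clusters. */
static int avail_sectors_new(int nb_clusters, int cluster_bits)
{
    return nb_clusters << (cluster_bits - BDRV_SECTOR_BITS);
}

/* Hypothetical helper: sectors from the start of the first newly allocated
 * cluster to the end of the write request, as in the diff. */
static int requested_sectors(int n_end, int keep_clusters, int cluster_bits)
{
    int cluster_sectors = 1 << (cluster_bits - BDRV_SECTOR_BITS);

    return n_end - keep_clusters * cluster_sectors;
}

/* Hypothetical helper: the value stored in QCowL2Meta.nb_available. */
static int nb_available(int requested, int avail)
{
    return MIN(requested, avail);
}
```

With keep_clusters = 1, nb_clusters = 2 and n_end = 300, requested_sectors is
172 and both variants give nb_available = 172, which is the case the commit
message describes. The two variants only disagree when requested_sectors
exceeds the size of the new allocation, for example n_end = 500 gives 372
before the patch and 256 after it. Whether such a request can actually reach
this code path is exactly the open question of this mail, so the sketch does
not answer it.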