[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
FWIW, just re-reproduced this with latest upstream kernel / qemu / fresh qcow2 image. -- You received this bug notification because you are a member of Ubuntu Server Team, which is subscribed to qemu in Ubuntu. https://bugs.launchpad.net/bugs/1292234 Title: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate) To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1292234/+subscriptions -- Ubuntu-server-bugs mailing list Ubuntu-server-bugs@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
** Changed in: qemu (Ubuntu)
       Status: Confirmed => In Progress
Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Excellent! Any chance you can start bisecting with http://people.canonical.com/~serge/binaries.{0..68}/{qemu-img,qemu-system-x86_64} ?
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Serge,

So I was able to just compile my own qemu and test with that. I did attempt a reverse bisect, and was able to reproduce as early as v1.1 and also on master HEAD. v1.0 was inconclusive because the qcow2 image I made with the newer binary seemed to be incompatible with v1.0; however, from Jamie's testing this seems to be a working version, so I'd say the original change that enabled this issue lies somewhere between v1.0.0 and v1.1.0.

As I've been unable to reproduce this without virsh, reverse bisecting and using older qemu versions is a bit challenging: machine types change, features virsh wants to use aren't available, etc.

Another interesting thing I tested today: I was able to reproduce with ext4 with extents disabled; maybe that gives more clues. Just to make sure I wasn't crazy, I mkfs'd the partition to vanilla ext4 and iterated for most of the afternoon with no failures.

My next steps are going to be enabling verbose output for qcow2, looking more deeply into what gets corrupted in the file, and turning on host filesystem debugging.

--chris
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Can we confirm which filesystems and options are enabled when reproducing (i.e., ext4 + extent mapping)[1]? Bug 1368815 sounds very much like this.

If the reproducing systems have ext4 extent mapping enabled, one could create an ext4 fs without extent mapping[2] and see if this still reproduces.

If it is related to the ext4 extents, the rate of memory pressure and the speed of the underlying device would determine whether or not the file ends up corrupt, which might explain the difficulty of reproducing.

1. % sudo tune2fs -l /dev/disk/by-id/dm-name-kriek--vg-root | grep -i features
   Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
2. mke2fs -t ext4 -O ^extent /dev/device
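A minimal sketch of the check in footnote [1], for anyone comparing hosts. The helper name has_extent_feature is made up for illustration; it only looks for the word "extent" in a "Filesystem features:" line such as the one tune2fs prints, and it assumes the features are separated by spaces.

```shell
#!/bin/sh
# has_extent_feature is a hypothetical helper, not part of e2fsprogs.
# It succeeds if the given "Filesystem features:" line lists `extent`
# as a whole word (so `extra_isize` or `ext_attr` do not match).
has_extent_feature() {
    case " $1 " in
        *" extent "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a real host (needs root; /dev/device is a placeholder):
#   features=$(tune2fs -l /dev/device | grep -i 'features')
#   if has_extent_feature "$features"; then
#       echo "extent mapping enabled"
#   else
#       echo "extent mapping not enabled"
#   fi
```

Matching on whole words matters here because several other feature names start with "ext".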
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Ryan,

The host's root filesystem is ext3/LVM (per Jamie's original configuration):

sudo tune2fs -l /dev/disk/by-id/dm-name-ubuntu--vg-root | grep -i features
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Actually, for me it is just ext3 without LVM.

$ sudo tune2fs -l /dev/sda3 | grep -i features
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Attached is a reproducer for this issue. Here is what needs to be done to set up the reproducer:

1) The host machine's filesystem needs to be ext3
2) Install a VM (via virsh) and use a qcow2 disk
3) Ensure you can ssh without a password and the VM has bonnie++ installed
4) Adjust the variables in the script before running
5) Run the script a couple of times

While this doesn't reproduce 100% of the time, I can usually get a failure within 1-3 trials. However, when executing this on an ext4 host filesystem I've been unable to reproduce the issue.

** Attachment added: lp1292234-repro.sh
   https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1292234/+attachment/4272431/+files/lp1292234-repro.sh
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Also I've been able to reproduce this with the latest master in qemu, and even with the latest daily 3.18-rcX kernel on the host.

** Also affects: qemu
   Importance: Undecided
       Status: New
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Ok, I think I can reproduce this; after running some disk operations (bonnie++ and splitting a 100MB file), if I shut down and try to boot the VM, the disk cannot be booted and I'm presented with the grub menu. However, this reproducer is not yet 100% reliable. Next week I'll work on bisecting it down after testing latest upstream.
Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Awesome - thank you Chris.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
** Changed in: qemu (Ubuntu)
     Assignee: Serge Hallyn (serge-hallyn) => Chris J Arges (arges)
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
$ od -x -N 72 forhallyn-trusty-amd64.img.corrupted | grep '[1-9]*'
refcount_table_cluster 0100
000 4651 fb49 0200
020 1000 0200
040 1000 0300
060 0100 0100 0100
100 0500
nb_snapshots = 0100
snapshots_offset = 0500

$ od -x -N 72 forhallyn-trusty-amd64.img | grep '[1-9]*'
000 4651 fb49 0200
020 1000 0200
040 1000 0300
060 0100 0100
100
nb_snapshots =
snapshots_offset =

Looking at just the QCowHeader (and not de-scrambling BE format), I see the following differences; however I think this looks 'ok'. I'll need to examine the rest of the file.
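To read the two snapshot fields without unscrambling the byte-swapped od -x words by hand, something like the sketch below could be used. It assumes the qcow2 version 2 header layout, in which nb_snapshots is a 4-byte big-endian field at byte offset 60 and snapshots_offset is an 8-byte big-endian field at byte offset 64; be_field is a made-up helper name, and large 8-byte values with the top bit set may overflow the shell's integer handling.

```shell
#!/bin/sh
# be_field FILE OFF LEN  (hypothetical helper)
# Print, in decimal, the big-endian unsigned integer stored in LEN bytes
# at byte offset OFF of FILE. od -tx1 dumps single bytes in file order,
# so concatenating them gives the big-endian hex value directly.
be_field() {
    file=$1
    off=$2
    len=$3
    hex=$(od -An -v -tx1 -j "$off" -N "$len" "$file" | tr -d ' \n')
    printf '%d\n' "0x$hex"
}

# Usage (offsets are assumptions taken from the version 2 header layout):
#   be_field forhallyn-trusty-amd64.img.corrupted 60 4   # nb_snapshots
#   be_field forhallyn-trusty-amd64.img.corrupted 64 8   # snapshots_offset
```

Running the same two calls against the good and the corrupted image gives a direct comparison of the snapshot count and snapshot table offset.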
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
FYI, I was able to reproduce this last night and uploaded forhallyn-trusty-amd64.img.corrupted.gz to https://chinstrap.canonical.com/~jamie/lp1292234/ for comparison with forhallyn-trusty-amd64.img.gz.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
On my main server (3.13.0-32-generic with precise userspace) I installed a trusty container with ext3 (LVM) backing store. There I installed uvt and created 4 VMs, 2 precise amd64 and 2 precise i386. Several times I did:

ubuntu@uvttest:~$ cat list
p-precise-server-amd64
p-precise-server-i386
q-precise-server-i386
q-precise-server-amd64
ubuntu@uvttest:~$ for n in `cat list`; do uvt start -fr $n; done
ubuntu@uvttest:~$ for n in `cat list`; do tmux splitw -p 25 -t $TMUX_PANE expect vmupgrade.expect $n; done

where vmupgrade.expect is:

=
#!/usr/bin/expect
set container [lrange $argv 0 0]
spawn ssh $container
#expect "assword:"
#send -- "ubuntu\r"
expect "$container:~$"
send -- "export DEBIAN_FRONTEND=noninteractive\r"
send -- "sudo sed -i 's/never/lts/' /etc/update-manager/release-upgrades\r"
expect "assword for ubuntu:"
send -- "ubuntu\r"
expect "$container:~$"
send -- "sudo apt-get update\r"
expect "$container:~$"
send -- "sudo do-release-upgrade -f DistUpgradeViewNonInteractive\r"
set timeout 11000
expect "$container:~$"
send -- "sudo reboot\r"
=

Then I run 'find /lib -name xxx; sudo reboot; find /lib -name xxx;' and look through dmesg for errors, then do

ubuntu@uvttest:~$ for n in `cat list`; do uvt stop -fr $n; done

Alas, I've seen no corruption yet. The goal here isn't just to reproduce it, but to do so reliably enough to be able to bisect - this isn't it :(
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
This happened again with an important VM. I still don't have a reproducer for testing the bisect packages :(
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I just had this happen to me with 2.1+dfsg-3ubuntu3 on utopic. I had a VM I had been using for days, then did a 'uvt stop -rf ...' followed by 'uvt update sec-utopic-amd64' and I was dropped to a grub rescue. :\ I'll downgrade again and regenerate the VM.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I've been running the scripts from comment #10. I have two VMs running simultaneously; I've completed 24 hours of this sequence, about 50 total cycles, with zero errors in the qcow2 images.

We're missing something; possibly hardware specific? Host machine is an Intel NUC on Trusty.

Linux kriek 3.13.0-34-generic #60-Ubuntu SMP Wed Aug 13 15:45:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

I'll see about increasing concurrency next.
Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
There are 69 commits to block/qcow* between 1.5.0 and 1.7.0. I have compiled binaries of qemu-system-x86_64 and qemu-img at each of those commits and pushed them to http://people.canonical.com/~serge/binaries.0 through http://people.canonical.com/~serge/binaries.68

Note that binaries.0 is the *latest* commit. So to bisect with these you could start with binaries.34, then if that shows corruption, try binaries.51, or if it does not, try binaries.17, etc. 6 or 7 steps should get us to a single commit.

It's not certain that one of these commits caused the regression, but it seems a reasonable place to start.
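The index arithmetic for that bisection can be sketched as follows. bisect_mid is a made-up helper name; the convention assumed here is that the first argument is the lowest index still in play and the second the highest, with binaries.0 being the newest commit. When the tested index shows corruption, it becomes the new lower bound (the culprit is that commit or an older one); when it does not, it becomes the new upper bound.

```shell
#!/bin/sh
# bisect_mid LOW HIGH  (hypothetical helper)
# Print the index halfway between LOW and HIGH, rounding down.
bisect_mid() {
    echo $(( ($1 + $2) / 2 ))
}

# Walking the first steps described in the message:
#   bisect_mid 0 68     first binaries.N to test
#   bisect_mid 34 68    next one if binaries.34 shows corruption
#   bisect_mid 0 34     next one if binaries.34 is clean
```

Each result is the N to plug into http://people.canonical.com/~serge/binaries.N for the next test.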
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I'm also starting work on updating uvt to use external snapshots instead; this would be an alternative to use while chasing down the bug in internal snapshots.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I tried to reproduce this many different ways with 2.1+dfsg-3ubuntu3 over the weekend and could not trigger the issue (with ksm enabled too). I don't know what version I had in comment #12. 2.1+dfsg-3ubuntu2 is plausible based on the date of the comment and the publication of this version, though I can't guarantee it wasn't 2.1+dfsg-2ubuntu2 or even 2.1+dfsg-2ubuntu1; I did specifically mention I used 2.1. I don't see anything in the changes that jumps out as a qcow2 corruption fix since my comment, so I'm worried I just haven't been able to reproduce.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
For the reproducers, something worth trying is external snapshots (instead of internal ones, which snapshot-create-as makes when run without flags). Instead run:

snapshot-create-as --disk-only

which will basically do

qemu-img create -b your_original_qcow2 -f qcow2 pristine

and store the snapshot delta in a separate file.
Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Hi Jamie,

just to make sure, did you permanently disable ksm? Does cat /sys/kernel/mm/ksm/run still show 0?

I've so far never seen a case where a reboot did not fix the issue, nor have I seen an issue (other than suspending the host sometimes causing the VM to hang so that I have to destroy it) with ksm disabled.

I had hoped to do some large parallel upgrade tests this week, but network at linuxcon is not up to the task (even with apt-cacher-ng!) If I can find a better room I'll see about trying there.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I disabled KSM by setting /etc/default/qemu-kvm to have:

KSM_ENABLED=0

and did 'sudo restart qemu-kvm'. I also rebooted before seeing the problem. Since then, I downgraded to saucy's qemu-kvm which reset KSM_ENABLED=1. I didn't specifically check /sys/kernel/mm/ksm/run and of course now this is set to '1'.
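Since the config file and the live kernel setting can disagree, a small check of the sysfs value may help when recording test conditions. This is only a sketch: ksm_state is a made-up helper name, and the optional path argument exists so it can be pointed at a copy of the file. The kernel documents three values for this file: 0 (ksmd stopped, merged pages kept), 1 (ksmd running) and 2 (ksmd stopped and all merged pages unmerged).

```shell
#!/bin/sh
# ksm_state [RUN_FILE]  (hypothetical helper)
# Print a word describing the KSM run state read from RUN_FILE,
# which defaults to /sys/kernel/mm/ksm/run.
ksm_state() {
    run_file=${1:-/sys/kernel/mm/ksm/run}
    case "$(cat "$run_file")" in
        0) echo stopped ;;
        1) echo running ;;
        2) echo unmerged ;;
        *) echo unknown ;;
    esac
}

# Usage:
#   ksm_state          # reads the live value on the host
```

Logging this next to each reproduction attempt would show whether KSM was really off at the time of a failure.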
Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Ok - thanks Jamie.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
On utopic amd64, I tried the new qemu 2.1 packages and disabled KSM. They seemed to be ok for a while, but after using 'uvt update' today (which under the hood does what is described in the bug description), I lost 6 VMs to this bug. A reboot did not solve it. I've downgraded to saucy again. Unfortunately, the saucy packages are no longer supported and have stopped getting security updates. This is getting rather dire for me.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
As far as I know, everyone who has experienced this has been using a thinkpad. I first experienced this myself last week, on a new thinkpad running utopic. Two curious things I noticed, besides this being a thinkpad:

1. I could not start the VM with the bad image at all. Until I rebooted. Then the image was fine, and fsck-clean. This suggests a possible problem with the page cache on the host.

2. I then disabled KSM. I have not seen this problem since then, however I also have not hit a vm quite as hard yet. Will have to see whether a series of package builds manages to make this happen again with KSM disabled.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
** Tags added: qcow2
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I have a clean install of trusty on an Intel laptop. I added the following upstart job in the forhallyn-trusty-amd64.img root partition:

description "update and shutdown"
author "Serge Hallyn serge.hal...@ubuntu.com"

start on runlevel [2345]

script
    sleep 5s
    apt-get update
    DEBIAN_FRONTEND='noninteractive' apt-get -y dist-upgrade
    sleep 5s
    shutdown -h now
end script

Then on the host I run this script:

#!/bin/bash
cp orig-with-upstart/forhallyn-trusty-amd64.img .
virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"
virsh start forhallyn-trusty-amd64
sleep 20s
while [ 1 ]; do
    virsh list | grep -q forhallyn || break
    sleep 20s
done
# guest has updated. check the image file and fs here
qemu-img check forhallyn-trusty-amd64.img
if [ $? -ne 0 ]; then
    echo "image check failed after shutdown"
    exit 1
fi
qemu-nbd -c /dev/nbd0 forhallyn-trusty-amd64.img
fsck -a /dev/nbd0p1
if [ $? -ne 0 ]; then
    echo "fs bad after shutdown"
    qemu-nbd -d /dev/nbd0
    exit 1
fi
qemu-nbd -d /dev/nbd0
# now tweak the snapshots
virsh snapshot-delete forhallyn-trusty-amd64 pristine --children
virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"
# and check the image file and fs again
qemu-img check forhallyn-trusty-amd64.img
if [ $? -ne 0 ]; then
    echo "image check failed after snapshot remove/create"
    exit 1
fi
qemu-nbd -c /dev/nbd0 forhallyn-trusty-amd64.img
fsck -a /dev/nbd0p1
if [ $? -ne 0 ]; then
    echo "fs bad after snapshot remove/create"
    qemu-nbd -d /dev/nbd0
    exit 1
fi
qemu-nbd -d /dev/nbd0
# all seems well
exit 0

I'll run that in a loop and see if it fails after 10 tries. If you see anything there that I am NOT doing which would help to reproduce, please let me know.
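The "run it in a loop" step could look roughly like the sketch below. run_until_failure is a made-up helper name and ./check-image.sh is a placeholder for wherever the host script above is saved; the helper simply reruns a command up to a given number of times and stops at the first non-zero exit status.

```shell
#!/bin/sh
# run_until_failure MAX CMD [ARGS...]  (hypothetical helper)
# Run CMD up to MAX times. Stop and return 1 at the first failing run,
# reporting which try failed; return 0 if every run succeeded.
run_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "failed on try $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max tries"
    return 0
}

# Usage (the script path is a placeholder):
#   run_until_failure 10 ./check-image.sh
```

Stopping at the first failure leaves the corrupted image in place for inspection instead of overwriting it on the next iteration.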
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
** Changed in: qemu (Ubuntu)
     Assignee: (unassigned) => Serge Hallyn (serge-hallyn)
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I believe I just tripped this bug; I compressed some qcow2 images using this:

for f in sec-{lucid,precise,quantal,saucy,trusty}-{amd64,i386} ; do echo $f ; qemu-img convert -s pristine -p -f qcow2 -O qcow2 $f.qcow2 reclaimed.qcow2 ; mv reclaimed.qcow2 $f.qcow2 ; virsh snapshot-delete $f --snapshotname pristine ; uvt snapshot $f ; done

The 'uvt snapshot' command makes a snapshot named 'pristine'.

AMD64 guests:

sec-lucid-amd64 booted without trouble.

sec-precise-amd64 reports:
Booting from Hard Disk...
Boot failed: not a bootable disk
No bootable device.

sec-quantal-amd64 reports:
Booting from Hard Disk...
error: file `/boot/grub/i386-pc/normal.mod' not found.
grub rescue

sec-saucy-amd64 reports:
Booting from Hard Disk...
error: file `/boot/grub/i386-pc/normal.mod' not found.
Entering rescue mode...
grub rescue

sec-trusty-amd64 reports:
Booting from Hard Disk...
Boot failed: not a bootable disk
No bootable device.

i386 guests:

sec-lucid-i386, sec-precise-i386, sec-quantal-i386, sec-saucy-i386 all booted fine.

sec-trusty-i386 reports:
Booting from Hard Disk...
Boot failed: not a bootable disk
No bootable device.

I use the i386 VMs significantly less often than the amd64 VMs.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: qemu (Ubuntu)
       Status: New => Confirmed
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
FYI, I periodically use and follow the same procedure that Seth described (in fact, I did it yesterday) and had no problems with qemu 1.5.0+dfsg-3ubuntu5.4 (which I've apt pinned since reporting this bug).

** Description changed:

  The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off:

  qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2

- $ cat /proc/version_signature
+ $ cat /proc/version_signature
  Ubuntu 3.13.0-16.36-generic 3.13.5

  $ qemu-img info ./forhallyn-trusty-amd64.img
  image: ./forhallyn-trusty-amd64.img
  file format: qcow2
  virtual size: 8.0G (8589934592 bytes)
  disk size: 4.0G
  cluster_size: 65536
  Format specific information:
-     compat: 0.10
+     compat: 0.10

  Steps to reproduce:
  1. create a virtual machine. For a simplified reproducer, I used virt-manager with:
-   OS type: Linux
-   Version: Ubuntu 14.04
-   Memory: 768
-   CPUs: 1
+   OS type: Linux
+   Version: Ubuntu 14.04
+   Memory: 768
+   CPUs: 1

-   Select managed or existing (Browse, new volume)
-   Create a new storage volume:
-     qcow2
-     Max capacity: 8192
-     Allocation: 0
+   Select managed or existing (Browse, new volume)
+   Create a new storage volume:
+     qcow2
+     Max capacity: 8192
+     Allocation: 0

-   Advanced:
-     NAT
-     kvm
-     x86_64
-     firmware: default
+   Advanced:
+     NAT
+     kvm
+     x86_64
+     firmware: default

  2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown

  3. Backup the image file somewhere since steps 1 and 2 take a while :)

  4. Execute the following commands which are based on what our uvt tool does:

  $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine uvt snapshot
  $ virsh snapshot-current --name forhallyn-trusty-amd64
  pristine
  $ virsh start forhallyn-trusty-amd64
  $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5

  in guest:
  sudo apt-get update
  sudo apt-get dist-upgrade
  780 upgraded...
  shutdown -h now

  $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children
  $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine uvt snapshot

  $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption

  The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM.

  After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc.

  This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5.

- Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.3
+ Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4
  from Ubuntu 13.10.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I've not yet been able to definitively reproduce this. (On a bad nested qemu setup I had some issues which I think were unrelated.) I've tried on a trusty laptop, and on a faster machine with a trusty container on a trusty kernel, starting with the images you posted for me each time.
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I don't, I just used the options that our uvt command uses. I downgraded to saucy's qemu in the meantime so I can do my work. Do you need me to try some new test? I'm not sure it makes any difference, but note that I am using a trusty host and kernel.
Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Quoting Jamie Strandboge (ja...@ubuntu.com):
> I don't, I just used the options that our uvt command uses. I downgraded to saucy's qemu in the meantime so I can do my work. Do you need me to try some new test?

Sigh, maybe. I will keep trying.

> I'm not sure it makes any difference, but note that I am using a trusty host and kernel.

Right, that's what I'm using. Have others on your team (who are not on the same thinkpad model :) seen this as well? Have you seen it on different types of machines? Does it happen more often if the machine is already working hard?

I wonder if I can reproduce it manually with qemu-img and qemu-nbd.
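For reference, a manual attempt with qemu-img and qemu-nbd might look roughly like the sketch below. It assumes the offline libvirt snapshot operations can be approximated with 'qemu-img snapshot -c' and 'qemu-img snapshot -d', which has not been confirmed in this bug. The names `run`, `snapshot_cycle`, `DRY_RUN`, `NBD_DEV` and `WORKLOAD_CMD` are invented for the sketch, and the write workload is left to the caller. Real use needs root and the nbd kernel module loaded.

```shell
#!/bin/sh
# Sketch only: create a snapshot, let something write to the image through
# qemu-nbd, then delete and re-create the snapshot under the same name and
# check the image. DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"
NBD_DEV="${NBD_DEV:-/dev/nbd0}"      # assumed free nbd device
WORKLOAD_CMD="${WORKLOAD_CMD:-}"     # hypothetical: command that writes to $NBD_DEV

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot_cycle IMAGE [SNAPSHOT_NAME]
snapshot_cycle() {
    img="$1"
    name="${2:-pristine}"

    run qemu-img snapshot -c "$name" "$img" || return 1

    # Expose the image as a block device so data can be written to it.
    run qemu-nbd --connect="$NBD_DEV" "$img" || return 1
    if [ -n "$WORKLOAD_CMD" ]; then
        run sh -c "$WORKLOAD_CMD"
    fi
    run qemu-nbd --disconnect "$NBD_DEV" || return 1

    # Delete the snapshot and create another one with the same name.
    run qemu-img snapshot -d "$name" "$img" || return 1
    run qemu-img snapshot -c "$name" "$img" || return 1

    # A non-zero exit status here would indicate a damaged image.
    run qemu-img check -f qcow2 "$img"
}
```

Run against a copy of the test image, for example `snapshot_cycle copy-of-image.qcow2`, since a workload that writes to the nbd device modifies the image contents.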
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Did you try with the image on https://chinstrap.canonical.com/~jamie/lp1292234/? I was only able to trigger it by using an old image, creating the snapshot, starting it, apt-get dist-upgrading, cleanly shutting down, then deleting the snapshot and creating another with the same name. Using a fresh install or a too new image doesn't do it for me (I guess enough has to happen in the guest to trigger it). Ie:

$ virsh snapshot-create-as forhallyn-trusty-amd64 pristine uvt snapshot
$ virsh snapshot-current --name forhallyn-trusty-amd64
pristine
$ virsh start forhallyn-trusty-amd64
$ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5

in guest:
sudo apt-get update
sudo apt-get dist-upgrade
780 upgraded...
shutdown -h now

$ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children
$ virsh snapshot-create-as forhallyn-trusty-amd64 pristine uvt snapshot

$ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption
Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Quoting Jamie Strandboge (ja...@ubuntu.com):
> Did you try with the image on https://chinstrap.canonical.com/~jamie/lp1292234/? I was only able to

Yup! I wget that, create the snapshot, upgrade, remove and create the snapshot, then start the vm. The upgrades take a long time so I've only tested it 3 times so far. How likely is the failure? Should I just keep going?
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
I have not yet been able to reproduce this. I'm considering adding an upstart job to your image which updates and shuts down, so I can test this in a loop. Do you know whether (a) the --children option to snapshot-delete or (b) using the same name for the new snapshot as the one you just deleted is crucial to reproducing this?
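The host side of such a loop could be sketched as follows, assuming the guest upgrades and powers itself off on its own (for instance via the upstart job mentioned above). The function names, the `DRY_RUN` switch and the 10 second polling interval are assumptions of this sketch, and passing "uvt snapshot" as a single description argument is a guess at how the original command was quoted. The second and third arguments allow varying (b) the new snapshot name and (a) the --children flag between runs.

```shell
#!/bin/sh
# Sketch only: one snapshot / upgrade / delete / re-create cycle driven from the
# host. DRY_RUN=1 (the default here) prints the commands and skips the waiting.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Poll until the guest has powered itself off.
wait_for_shutoff() {
    while [ "$(virsh domstate "$1")" != "shut off" ]; do
        sleep 10
    done
}

# cycle DOMAIN [NEW_SNAPSHOT_NAME] [DELETE_FLAG]
# Pass an empty third argument to drop --children from snapshot-delete.
cycle() {
    dom="$1"
    new_name="${2:-pristine}"
    delete_flag="${3---children}"   # default applies only when $3 is unset

    run virsh snapshot-create-as "$dom" pristine "uvt snapshot" || return 1
    run virsh start "$dom" || return 1

    # The guest is expected to dist-upgrade and shut down by itself.
    [ "$DRY_RUN" = "1" ] || wait_for_shutoff "$dom"

    # $delete_flag is intentionally unquoted so an empty value adds no argument.
    run virsh snapshot-delete "$dom" pristine $delete_flag || return 1
    run virsh snapshot-create-as "$dom" "$new_name" "uvt snapshot" || return 1

    # Booting now is where the corruption has been showing up.
    run virsh start "$dom"
}
```

Restoring the image from the backup before each call, then running for example `cycle forhallyn-trusty-amd64`, `cycle forhallyn-trusty-amd64 pristine ""` and `cycle forhallyn-trusty-amd64 pristine2`, would exercise the three variants. Deciding whether a given boot failed is still manual in this sketch.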
[Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
** Changed in: qemu (Ubuntu)
   Importance: Undecided => High