Launchpad has imported 15 comments from the remote bug at https://bugzilla.redhat.com/show_bug.cgi?id=676205.
If you reply to an imported comment from within Launchpad, your comment will be sent to the remote bug automatically. Read more about Launchpad's inter-bugtracker facilities at https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2011-02-09T04:13:47+00:00 Douglas wrote:

Description of problem:

Customer cannot resume virtual machine; the error is below:

# virsh resume v_rhel5_prod
error: Failed to resume domain v_rhel5_prod
error: Timed out during operation: cannot acquire state change lock

Version-Release number of selected component (if applicable):
libvirt-0.8.2-15.el5_6.1.x86_64
libvirt-python-0.8.2-15.el5_6.1.x86_64

Also tried:

# virsh destroy v_rhel5_prod
error: Failed to destroy domain v_rhel5_prod
error: Timed out during operation: cannot acquire state change lock

# virsh start v_rhel5_prod
error: Domain is already active

Additional info:
Even after rebooting the host, the VM stays locked.

Similar issue:
https://bugzilla.redhat.com/show_bug.cgi?id=668438

The debug logs are attached.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/0

------------------------------------------------------------------------
On 2011-02-09T04:18:50+00:00 Douglas wrote:

Created attachment 477730
debug logs

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/1

------------------------------------------------------------------------
On 2011-02-09T17:18:24+00:00 Daniel wrote:

Are there any files in /var/lib/libvirt/qemu/save or /var/lib/libvirt/qemu/snapshot? And is the 'libvirt-guests' initscript enabled on boot?
Most likely guess would be that there was a saved guest that failed to restore properly on boot.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/2

------------------------------------------------------------------------
On 2011-02-15T21:34:53+00:00 Douglas wrote:

Hello Daniel,

Sorry for the delay here; the customer decided to move to RHEL 6. They are uploading these files for further analysis. Would you like to make any suggestions?

Thanks
Douglas

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/3

------------------------------------------------------------------------
On 2011-02-21T14:54:19+00:00 John wrote:

Hi,

I am running libvirtd with kvm on a Debian Squeeze host and I am experiencing the same problem from time to time. I'm using virt-manager to control my virtual machines and sometimes libvirtd runs into problems controlling a kvm domain.

I cannot say exactly when the problem occurs, but it usually happens when I start and stop several virtual machines one after another. I.e., I have several virtual machines with test installations for development, and since some of them are running unstable versions (Debian for example), I start all the VMs once a week to update them, usually not more than two at the same time since my kvm host has only 4GB of RAM.

It then usually happens that the state of the virtual machines is not updated in virt-manager, and when trying to start a virtual machine which is powered off, I receive the aforementioned error message.

The problem is always fixed by:

killall -9 libvirtd
rm /var/run/libvirtd.pid
/etc/init.d/libvirt-bin restart

The virtual machines are never affected by this problem; they continue to run without any problems. It simply seems that libvirtd at some point cannot connect to the kvm host anymore due to a race condition.

I'm attaching a screenshot of the error message in virt-manager the last time it happened.
In this case, I logged into my Debian Squeeze kvm host over ssh and used X-forwarding to display virt-manager on the MacOS X host. virt-manager was not running on the Mac.

Version numbers:

dpkg -l libvirt\* |grep -e '^ii'
ii libvirt-bin 0.8.3-5 the programs for the libvirt library
ii libvirt0

dpkg -l virt\* |grep -e '^ii'
ii virt-manager 0.8.4-8 desktop application for managing virtual machines
ii virt-viewer 0.2.1-1 Displaying the graphical console of a virtual machine
ii virtinst 0.500.3-2

Regards,

Adrian

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/4

------------------------------------------------------------------------
On 2011-02-21T14:56:13+00:00 John wrote:

Created attachment 479933
Screenshot of virt-manager running on Debian Squeeze (over X-forwarding on MacOSX) when the problem with libvirtd occurred

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/5

------------------------------------------------------------------------
On 2011-04-29T15:32:28+00:00 John wrote:

I had the same problem. Running a RHEL5.6 host machine with the latest patches as of today.

# lsb_release -r
Release: 5.6
# uname -a
Linux lark.cs.unc.edu 2.6.18-238.9.1.el5 #1 SMP Fri Mar 18 12:42:39 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
# rpm -qa|grep virt
virt-manager-0.6.1-13.el5
libvirt-0.8.2-15.el5_6.3
libvirt-0.8.2-15.el5_6.3
python-virtinst-0.400.3-11.el5
libvirt-python-0.8.2-15.el5_6.3

Installed a RHEL6.0 virtual machine; the install went fine. The install rebooted at the end, then the virtual machine window hung and the virt-manager window hung. I let them sit for 5 minutes or so, then killed them. I have 2 other old virtual machines that continued to run.
Tried to start the new machine:

virsh # start lark-virtx
error: Failed to start domain lark-virtx
error: Timed out during operation: cannot acquire state change lock

Stopped libvirtd:
service libvirtd stop

Removed run directory:
rm -rf /var/run/libvirt

Note: after the shutdown there was no /var/run/libvirt.pid file.

Started libvirtd:
service libvirtd start

Was able to use virt-manager to start the new virtual machine.

Thanks Adrian, that worked!

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/22

------------------------------------------------------------------------
On 2011-04-29T17:19:41+00:00 John wrote:

Hi John,

on a side note: I recently upgraded libvirt from 0.8.2 to 0.9.0 and haven't seen the problem since. So, if you have the possibility to upgrade your libvirtd to version 0.9.0 or newer, I highly suggest you do so and see if that permanently fixes the problem for you as well.

It's certainly also nice for the maintainers/developers to know whether the new version fixes the bug, and if several people independently claim it does, they will be able to change this bug report to "fixed" =).

Greetings from Norway,

Adrian

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/23

------------------------------------------------------------------------
On 2011-05-02T19:08:02+00:00 John wrote:

Got a chance to play with this again for a few minutes. If I log in and halt a VM, or do a Shut Down from virt-manager, this consistently hangs virt-manager and the vnc client window.

If I do a "service libvirtd restart", virt-manager is able to reconnect again. I do not have to remove any /var/run/libvirt* files.

I have been running 2 virtual machines for over a year. I just noticed this problem because I created a new machine and started seeing virt-manager hanging with the "Timed out during operation: cannot acquire state change lock" error.

Hope an update comes out soon.
Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/24

------------------------------------------------------------------------
On 2011-05-02T19:18:59+00:00 John wrote:

John,

as I previously mentioned, the bug has been fixed in version 0.9.0 and later. But since you are using an older version and cannot easily upgrade, the most reasonable solution would be a backport of the fix. That means the lines of code that were changed in 0.9.0 to address this particular problem should also be changed in 0.8.x, without changing anything else, to make sure that no other, possibly new problems are introduced.

I haven't checked the changelog of libvirt 0.9.0, so I don't know which change actually fixed the problem, but I am pretty sure that it can easily be backported and will be backported, since many people are using libvirt 0.8.x on RHEL, which they have paid support for.

Adrian

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/25

------------------------------------------------------------------------
On 2011-05-02T19:35:26+00:00 John wrote:

Yes, I just thought that having a consistent way to reproduce the problem (shutting down the system) would be helpful information.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/26

------------------------------------------------------------------------
On 2011-09-28T16:26:37+00:00 Daniel wrote:

Summary of situation wrt "Timed out during operation: cannot acquire state change lock"

There are a few reasons why you might see that error message in RHEL-5:

 1. The QEMU process has hung.

    QEMU won't respond to monitor commands. The API call making the first monitor command will wait forever; any subsequent API calls issuing monitor commands will time out after ~30 seconds with this libvirt error message.

    This is expected behaviour when QEMU has hung.

 2. The QEMU process is working on a very long/slow monitor command.

    The API call making the long monitor command will wait until it (eventually) finishes. Any subsequent API calls wanting to issue monitor commands will wait up to ~30 seconds for the first call to finish, after which they return this libvirt error message.

    This is also expected behaviour when one API call is running a very long monitor command.

 3. Migration is aborted in between the 'Prepare' and 'Finish' steps.

    Migration is a 3 phase process. First we 'Prepare' on the target host, acquiring the lock. Then we run on the source host. Finally we 'Finish' on the target host, releasing the lock. If the libvirt client dies/quits half way through, the lock may never be released. In this case, further monitor commands will return this libvirt error message.

    This is a bug.

 4. Libvirt has a bug in lock handling.

    libvirt might run a monitor command, but forget to release the 'state change lock' once complete. Again, further monitor commands will return this message.

    This is a bug.

In RHEL-6.2 we have done a number of things to address / mitigate these problems:

 - It is now always possible to destroy a guest, even if the monitor is stuck. This lets you destroy a guest in scenario 1, which is not always possible with RHEL-5 libvirt without restarting libvirtd.

 - Some pieces of code which held the lock for a long time have been refactored to hold it for a much shorter period. This is primarily migration/save/restore/snapshot code. This should address some of the common reasons for seeing this error message.

 - The migration code has been made more robust, to guarantee that all locks are released, even if the migration client aborts/quits without calling Finish.

So in RHEL-6.2, only scenarios 1/2 should remain, and those should occur less frequently, or at least be recoverable without requiring a libvirtd daemon restart, by killing the guest in question.
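The behaviour in scenarios 1 and 2 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not libvirt's actual code: the names `DomainJob`, `run` and `demo` are hypothetical, a plain Python lock stands in for the per-domain 'state change lock', and the ~30 second wait is shortened to a fraction of a second so the example finishes quickly.

```python
import threading

# Assumption: stands in for the ~30 s wait mentioned above, shortened for the demo.
STATE_CHANGE_TIMEOUT = 0.2


class DomainJob:
    """Toy model of a per-domain state change lock (hypothetical, not libvirt code)."""

    def __init__(self, timeout=STATE_CHANGE_TIMEOUT):
        self._lock = threading.Lock()
        self._timeout = timeout

    def run(self, name, monitor_command):
        # Callers wait a bounded time for the lock instead of blocking forever.
        if not self._lock.acquire(timeout=self._timeout):
            return f"{name}: Timed out during operation: cannot acquire state change lock"
        try:
            # The lock holder itself waits as long as the "monitor" takes to reply.
            monitor_command()
            return f"{name}: ok"
        finally:
            self._lock.release()


def demo():
    job = DomainJob()
    monitor_replied = threading.Event()  # set once the stuck "QEMU monitor" answers
    holder_started = threading.Event()   # set once the first call holds the lock
    first_result = []

    def stuck_command():
        holder_started.set()
        monitor_replied.wait()

    first = threading.Thread(
        target=lambda: first_result.append(job.run("resume", stuck_command))
    )
    first.start()
    holder_started.wait()

    # Second API call while the first still holds the lock: it times out.
    second = job.run("destroy", lambda: None)

    # Once the monitor answers, the first call finishes and releases the lock.
    monitor_replied.set()
    first.join()
    third = job.run("destroy", lambda: None)
    return second, first_result[0], third


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In this model the second call fails only because the first call is still waiting on its monitor command; nothing is wrong with the lock itself. Scenarios 3 and 4 would correspond to the lock never being released at all, so every later call would time out until the process holding it is restarted.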
The changes made in RHEL-6.1/6.2 to deal with this error message required a lot of changes across all areas of the code. These changes would not be practical to backport to RHEL-5, because of the risk of them introducing regressions in other areas.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/27

------------------------------------------------------------------------
On 2011-10-26T19:05:16+00:00 RHEL wrote:

Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/30

------------------------------------------------------------------------
On 2012-11-21T05:59:08+00:00 nigil wrote:

I want to add:

When I tried to shut down the VM, it went to the paused state and just hung. I could not resume or shut down the VM from the paused state.

[root@lnx132-75 vol_vm_data_disk_f63]# virsh list
 Id Name                 State
----------------------------------
 12 vm2_rhel6_x86_64     paused
 15 vm6_win2003_x86_64   running
 16 vm7_win2008_x86_64   running
 17 vm8_win7_x86         running
 18 vm9_win7_x86_64      running

[root@lnx132-75]# virsh shutdown vm2_rhel6_x86_64
error: Failed to shutdown domain vm2_rhel6_x86_64
error: Timed out during operation: cannot acquire state change lock

[root@lnx132-75]# virsh resume vm2_rhel6_x86_64
error: Failed to resume domain vm2_rhel6_x86_64
error: Timed out during operation: cannot acquire state change lock

[root@lnx132-75]# virsh start vm2_rhel6_x86_64
error: Domain is already active

[root@lnx132-75]# lsb_release -r
Release: 5.8

[root@lnx132-75]# rpm -qa | grep libvirt
libvirt-cim-0.5.8-3.el5
libvirt-0.8.2-25.el5
libvirt-0.8.2-25.el5
libvirt-python-0.8.2-25.el5

Found that the XML of the VM is saved as .save.

[root@lnx132-75 save]# pwd
/var/lib/libvirt/qemu/save
[root@lnx132-75 save]# ls
vm2_rhel6_x86_64.save

Removed the .save file and tried to resume/shut down again, but the same issue was observed.
Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/33

------------------------------------------------------------------------
On 2012-11-21T10:58:10+00:00 nigil wrote:

I want to add:

When I tried to shut down the VM, it went to the paused state and just hung. I could not resume or shut down the VM from the paused state.

[root@lnx132-75 vol_vm_data_disk_f63]# virsh list
 Id Name                 State
----------------------------------
 12 vm2_rhel6_x86_64     paused
 15 vm6_win2003_x86_64   running
 16 vm7_win2008_x86_64   running
 17 vm8_win7_x86         running
 18 vm9_win7_x86_64      running

[root@lnx132-75]# virsh shutdown vm2_rhel6_x86_64
error: Failed to shutdown domain vm2_rhel6_x86_64
error: Timed out during operation: cannot acquire state change lock

[root@lnx132-75]# virsh resume vm2_rhel6_x86_64
error: Failed to resume domain vm2_rhel6_x86_64
error: Timed out during operation: cannot acquire state change lock

[root@lnx132-75]# virsh start vm2_rhel6_x86_64
error: Domain is already active

[root@lnx132-75]# lsb_release -r
Release: 5.8

[root@lnx132-75]# rpm -qa | grep libvirt
libvirt-cim-0.5.8-3.el5
libvirt-0.8.2-25.el5
libvirt-0.8.2-25.el5
libvirt-python-0.8.2-25.el5

Found that the XML of the VM is saved as .save.

[root@lnx132-75 save]# pwd
/var/lib/libvirt/qemu/save
[root@lnx132-75 save]# ls
vm2_rhel6_x86_64.save

Removed the .save file and tried to resume/shut down again, but the same issue was observed.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/734777/comments/34

** Changed in: libvirt
       Status: Unknown => Won't Fix

** Changed in: libvirt
   Importance: Unknown => Critical

** Bug watch added: Red Hat Bugzilla #668438
   https://bugzilla.redhat.com/show_bug.cgi?id=668438

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/734777

Title:
  "cannot acquire state change lock" problems

To manage notifications about this bug go to:
https://bugs.launchpad.net/libvirt/+bug/734777/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs