Re: [libvirt] [PATCH 00/19] Rollback migration when libvirtd restarts
On 07/28/2011 05:41 AM, Eric Blake wrote:
> On 07/07/2011 05:34 PM, Jiri Denemark wrote:
>> This series is also available at
>> https://gitorious.org/~jirka/libvirt/jirka-staging/commits/migration-recovery
>>
>> The series does several things:
>> - persists current job and its phase in status xml
>> - allows safe monitor commands to be run during migration/save/dump jobs
>> - implements recovery when libvirtd is restarted while a job is active
>> - consolidates some code and fixes bugs I found when working in the area
>
> git bisect is pointing to this series as the cause of a regression in
> 'virsh managedsave dom' triggering libvirtd core dumps if some other
> process is actively making queries on the domain at the same time
> (virt-manager is a great process for fitting that bill). I'm trying to
> further narrow down which patch introduced the regression, and see if
> I can plug the race (probably a case of not checking whether the
> monitor still exists when getting the condition for an asynchronous
> job, since the whole point of virsh [managed]save is that the domain
> will go away when the save completes, but that it is time-consuming
> enough that we want to query domain state in the meantime).

I can reproduce this bug.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x706d0700 (LWP 11419)]
0x004b9ad8 in qemuMonitorSend (mon=0x7fffe815c060, msg=0x706cf380)
    at qemu/qemu_monitor.c:801
801         while (!mon->msg->finished) {

The reason is that mon->msg is NULL. I added some debug code and found
that we send a monitor command while the last command has not finished,
and then libvirtd crashes.

After reading the code, I think something is wrong in the function
qemuDomainObjEnterMonitorInternal():

    if (priv->job.active == QEMU_JOB_NONE && priv->job.asyncJob) {
        if (qemuDomainObjBeginNestedJob(driver, obj) < 0)

We can run a query job while an async job is running. When we query the
migration's status, priv->job.active is not QEMU_JOB_NONE, and we do
not wait for the query job to finish.
So we send a monitor command while the last command has not finished.
It's very dangerous.

When we run an async job, we cannot tell whether the job is a nested
async job from priv->job.active's value. I think we should introduce
four functions for async nested jobs:

qemuDomainObjAsyncEnterMonitor()
qemuDomainObjAsyncEnterMonitorWithDriver()
qemuDomainObjAsyncExitMonitor()
qemuDomainObjAsyncExitMonitorWithDriver()

The callers of qemuDomainObjEnterMonitorInternal() should pass a bool
value to tell qemuDomainObjEnterMonitorInternal() whether the job is an
async nested job.

Thanks
Wen Congyang

(gdb) bt
#0  0x004b9ad8 in qemuMonitorSend (mon=0x7fffe815c060, msg=0x706cf380)
    at qemu/qemu_monitor.c:801
#1  0x004c77ae in qemuMonitorJSONCommandWithFd (mon=0x7fffe815c060,
    cmd=0x7fffd8000940, scm_fd=-1, reply=0x706cf480)
    at qemu/qemu_monitor_json.c:225
#2  0x004c78e5 in qemuMonitorJSONCommand (mon=0x7fffe815c060,
    cmd=0x7fffd8000940, reply=0x706cf480) at qemu/qemu_monitor_json.c:254
#3  0x004cc19c in qemuMonitorJSONGetMigrationStatus (mon=0x7fffe815c060,
    status=0x706cf580, transferred=0x706cf570, remaining=0x706cf568,
    total=0x706cf560) at qemu/qemu_monitor_json.c:1920
#4  0x004bc1b3 in qemuMonitorGetMigrationStatus (mon=0x7fffe815c060,
    status=0x706cf580, transferred=0x706cf570, remaining=0x706cf568,
    total=0x706cf560) at qemu/qemu_monitor.c:1532
#5  0x004b201b in qemuMigrationUpdateJobStatus (driver=0x7fffe80089f0,
    vm=0x7fffe8015cd0, job=0x5427b6 "domain save job")
    at qemu/qemu_migration.c:765
#6  0x004b2383 in qemuMigrationWaitForCompletion (driver=0x7fffe80089f0,
    vm=0x7fffe8015cd0) at qemu/qemu_migration.c:846
#7  0x004b7806 in qemuMigrationToFile (driver=0x7fffe80089f0,
    vm=0x7fffe8015cd0, fd=27, offset=4096,
    path=0x7fffd8000990 "/var/lib/libvirt/qemu/save/fedora_12.save",
    compressor=0x0, is_reg=true, bypassSecurityDriver=true)
    at qemu/qemu_migration.c:2766
#8  0x0046a90d in qemuDomainSaveInternal (driver=0x7fffe80089f0,
    dom=0x7fffd8000ad0, vm=0x7fffe8015cd0,
    path=0x7fffd8000990 "/var/lib/libvirt/qemu/save/fedora_12.save",
    compressed=0, bypass_cache=false) at qemu/qemu_driver.c:2386

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
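The race in the quoted check and the proposed explicit-flag fix can be
modeled in a few lines. This is only an illustrative sketch with
hypothetical names (needs_nested_job_current, needs_nested_job_proposed),
not libvirt's actual code:

```c
#include <stdbool.h>

enum qemu_job { QEMU_JOB_NONE = 0, QEMU_JOB_QUERY, QEMU_JOB_MIGRATION };

struct job_state {
    enum qemu_job active;  /* current normal job, QEMU_JOB_NONE if idle */
    bool async_running;    /* an async job (e.g. migration) is in progress */
};

/* Current logic: a nested job is begun only when no normal job is
 * active, so the decision silently changes whenever a concurrent
 * query job has set job.active. */
static bool needs_nested_job_current(const struct job_state *s)
{
    return s->active == QEMU_JOB_NONE && s->async_running;
}

/* Proposed fix: the caller states explicitly whether it runs on behalf
 * of the async job, so the decision no longer depends on what some
 * other thread happens to be doing. */
static bool needs_nested_job_proposed(const struct job_state *s,
                                      bool async_nested)
{
    (void)s;
    return async_nested;
}
```

With a query job active during migration, the current check skips the
nested-job handshake for the async job's own monitor command, which is
exactly the window in which two commands can reach the monitor at once.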
Re: [libvirt] [PATCH 00/19] Rollback migration when libvirtd restarts
On 07/28/2011 03:26 PM, Wen Congyang wrote:
> On 07/28/2011 05:41 AM, Eric Blake wrote:
>> On 07/07/2011 05:34 PM, Jiri Denemark wrote:
>>> This series is also available at
>>> https://gitorious.org/~jirka/libvirt/jirka-staging/commits/migration-recovery
>>>
>>> The series does several things:
>>> - persists current job and its phase in status xml
>>> - allows safe monitor commands to be run during migration/save/dump jobs
>>> - implements recovery when libvirtd is restarted while a job is active
>>> - consolidates some code and fixes bugs I found when working in the area
>>
>> git bisect is pointing to this series as the cause of a regression in
>> 'virsh managedsave dom' triggering libvirtd core dumps if some other
>> process is actively making queries on the domain at the same time
>> (virt-manager is a great process for fitting that bill). I'm trying
>> to further narrow down which patch introduced the regression, and see
>> if I can plug the race (probably a case of not checking whether the
>> monitor still exists when getting the condition for an asynchronous
>> job, since the whole point of virsh [managed]save is that the domain
>> will go away when the save completes, but that it is time-consuming
>> enough that we want to query domain state in the meantime).
>
> I can reproduce this bug.
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x706d0700 (LWP 11419)]
> 0x004b9ad8 in qemuMonitorSend (mon=0x7fffe815c060, msg=0x706cf380)
>     at qemu/qemu_monitor.c:801
> 801         while (!mon->msg->finished) {
>
> The reason is that mon->msg is NULL. I added some debug code and found
> that we send a monitor command while the last command has not
> finished, and then libvirtd crashes.
>
> After reading the code, I think something is wrong in the function
> qemuDomainObjEnterMonitorInternal():
>
>     if (priv->job.active == QEMU_JOB_NONE && priv->job.asyncJob) {
>         if (qemuDomainObjBeginNestedJob(driver, obj) < 0)
>
> We can run a query job while an async job is running. When we query
> the migration's status, priv->job.active is not QEMU_JOB_NONE, and we
> do not wait for the query job to finish. So we send a monitor command
> while the last command has not finished. It's very dangerous.
>
> When we run an async job, we cannot tell whether the job is a nested
> async job from priv->job.active's value. I think we should introduce
> four functions for async nested jobs:

Some functions (for example qemuProcessStopCPUs) can be used by both
sync and async jobs, so when we enter these functions we do not know
which type of job called them.

We support running a sync job and an async job at the same time. It
means that the monitor commands for the two jobs can be run in any
order.

Another way to fix this bug is: if we try to send a monitor command
while the last command has not finished, we wait for the last monitor
command to finish.

> qemuDomainObjAsyncEnterMonitor()
> qemuDomainObjAsyncEnterMonitorWithDriver()
> qemuDomainObjAsyncExitMonitor()
> qemuDomainObjAsyncExitMonitorWithDriver()
>
> The callers of qemuDomainObjEnterMonitorInternal() should pass a bool
> value to tell qemuDomainObjEnterMonitorInternal() whether the job is
> an async nested job.
>
> Thanks
> Wen Congyang
>
> (gdb) bt
> #0  0x004b9ad8 in qemuMonitorSend (mon=0x7fffe815c060, msg=0x706cf380)
>     at qemu/qemu_monitor.c:801
> #1  0x004c77ae in qemuMonitorJSONCommandWithFd (mon=0x7fffe815c060,
>     cmd=0x7fffd8000940, scm_fd=-1, reply=0x706cf480)
>     at qemu/qemu_monitor_json.c:225
> #2  0x004c78e5 in qemuMonitorJSONCommand (mon=0x7fffe815c060,
>     cmd=0x7fffd8000940, reply=0x706cf480) at qemu/qemu_monitor_json.c:254
> #3  0x004cc19c in qemuMonitorJSONGetMigrationStatus (mon=0x7fffe815c060,
>     status=0x706cf580, transferred=0x706cf570, remaining=0x706cf568,
>     total=0x706cf560) at qemu/qemu_monitor_json.c:1920
> #4  0x004bc1b3 in qemuMonitorGetMigrationStatus (mon=0x7fffe815c060,
>     status=0x706cf580, transferred=0x706cf570, remaining=0x706cf568,
>     total=0x706cf560) at qemu/qemu_monitor.c:1532
> #5  0x004b201b in qemuMigrationUpdateJobStatus (driver=0x7fffe80089f0,
>     vm=0x7fffe8015cd0, job=0x5427b6 "domain save job")
>     at qemu/qemu_migration.c:765
> #6  0x004b2383 in qemuMigrationWaitForCompletion (driver=0x7fffe80089f0,
>     vm=0x7fffe8015cd0) at qemu/qemu_migration.c:846
> #7  0x004b7806 in qemuMigrationToFile (driver=0x7fffe80089f0,
>     vm=0x7fffe8015cd0, fd=27, offset=4096,
>     path=0x7fffd8000990 "/var/lib/libvirt/qemu/save/fedora_12.save",
>     compressor=0x0, is_reg=true, bypassSecurityDriver=true)
>     at qemu/qemu_migration.c:2766
> #8  0x0046a90d in qemuDomainSaveInternal (driver=0x7fffe80089f0,
>     dom=0x7fffd8000ad0, vm=0x7fffe8015cd0,
>     path=0x7fffd8000990 "/var/lib/libvirt/qemu/save/fedora_12.save",
>     compressed=0, bypass_cache=false) at qemu/qemu_driver.c:2386

--
libvir-list mailing list
libvir-list@redhat.com
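The alternative fix mentioned above — wait for the last monitor command
to finish before sending a new one — can be sketched with a condition
variable. The names (struct monitor, monitor_send, monitor_complete)
are illustrative stand-ins, not the real qemuMonitorSend():

```c
#include <pthread.h>
#include <stddef.h>

struct qemu_msg { int finished; };

struct monitor {
    pthread_mutex_t lock;
    pthread_cond_t  idle;   /* signalled when msg returns to NULL */
    struct qemu_msg *msg;   /* command currently in flight, or NULL */
};

/* Instead of dereferencing mon->msg->finished while mon->msg may be
 * NULL (the reported SIGSEGV), a sender first waits until the monitor
 * is idle. */
static void monitor_send(struct monitor *mon, struct qemu_msg *m)
{
    pthread_mutex_lock(&mon->lock);
    while (mon->msg != NULL)                 /* previous command pending */
        pthread_cond_wait(&mon->idle, &mon->lock);
    mon->msg = m;                            /* monitor is now ours */
    pthread_mutex_unlock(&mon->lock);
    /* ... write m to the monitor socket, wait for m->finished ... */
}

/* Called once the reply for the in-flight command has been processed. */
static void monitor_complete(struct monitor *mon)
{
    pthread_mutex_lock(&mon->lock);
    mon->msg->finished = 1;
    mon->msg = NULL;
    pthread_cond_broadcast(&mon->idle);      /* wake queued senders */
    pthread_mutex_unlock(&mon->lock);
}
```

This serializes commands from the sync and async job at the monitor
itself, at the cost of letting them interleave in arbitrary order, which
is the property Wen points out above.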
Re: [libvirt] [PATCH 00/19] Rollback migration when libvirtd restarts
On 07/07/2011 05:34 PM, Jiri Denemark wrote:
> This series is also available at
> https://gitorious.org/~jirka/libvirt/jirka-staging/commits/migration-recovery
>
> The series does several things:
> - persists current job and its phase in status xml
> - allows safe monitor commands to be run during migration/save/dump jobs
> - implements recovery when libvirtd is restarted while a job is active
> - consolidates some code and fixes bugs I found when working in the area

git bisect is pointing to this series as the cause of a regression in
'virsh managedsave dom' triggering libvirtd core dumps if some other
process is actively making queries on the domain at the same time
(virt-manager is a great process for fitting that bill). I'm trying to
further narrow down which patch introduced the regression, and see if I
can plug the race (probably a case of not checking whether the monitor
still exists when getting the condition for an asynchronous job, since
the whole point of virsh [managed]save is that the domain will go away
when the save completes, but that it is time-consuming enough that we
want to query domain state in the meantime).

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x706d0700 (LWP 11419)]
0x004b9ad8 in qemuMonitorSend (mon=0x7fffe815c060, msg=0x706cf380)
    at qemu/qemu_monitor.c:801
801         while (!mon->msg->finished) {
(gdb) bt
#0  0x004b9ad8 in qemuMonitorSend (mon=0x7fffe815c060, msg=0x706cf380)
    at qemu/qemu_monitor.c:801
#1  0x004c77ae in qemuMonitorJSONCommandWithFd (mon=0x7fffe815c060,
    cmd=0x7fffd8000940, scm_fd=-1, reply=0x706cf480)
    at qemu/qemu_monitor_json.c:225
#2  0x004c78e5 in qemuMonitorJSONCommand (mon=0x7fffe815c060,
    cmd=0x7fffd8000940, reply=0x706cf480) at qemu/qemu_monitor_json.c:254
#3  0x004cc19c in qemuMonitorJSONGetMigrationStatus (mon=0x7fffe815c060,
    status=0x706cf580, transferred=0x706cf570, remaining=0x706cf568,
    total=0x706cf560) at qemu/qemu_monitor_json.c:1920
#4  0x004bc1b3 in qemuMonitorGetMigrationStatus (mon=0x7fffe815c060,
    status=0x706cf580, transferred=0x706cf570, remaining=0x706cf568,
    total=0x706cf560) at qemu/qemu_monitor.c:1532
#5  0x004b201b in qemuMigrationUpdateJobStatus (driver=0x7fffe80089f0,
    vm=0x7fffe8015cd0, job=0x5427b6 "domain save job")
    at qemu/qemu_migration.c:765
#6  0x004b2383 in qemuMigrationWaitForCompletion (driver=0x7fffe80089f0,
    vm=0x7fffe8015cd0) at qemu/qemu_migration.c:846
#7  0x004b7806 in qemuMigrationToFile (driver=0x7fffe80089f0,
    vm=0x7fffe8015cd0, fd=27, offset=4096,
    path=0x7fffd8000990 "/var/lib/libvirt/qemu/save/fedora_12.save",
    compressor=0x0, is_reg=true, bypassSecurityDriver=true)
    at qemu/qemu_migration.c:2766
#8  0x0046a90d in qemuDomainSaveInternal (driver=0x7fffe80089f0,
    dom=0x7fffd8000ad0, vm=0x7fffe8015cd0,
    path=0x7fffd8000990 "/var/lib/libvirt/qemu/save/fedora_12.save",
    compressed=0, bypass_cache=false) at qemu/qemu_driver.c:2386

--
Eric Blake   ebl...@redhat.com   +1-801-349-2682
Libvirt virtualization library http://libvirt.org

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/19] Rollback migration when libvirtd restarts
On Fri, Jul 08, 2011 at 01:34:05 +0200, Jiri Denemark wrote:
> This series is also available at
> https://gitorious.org/~jirka/libvirt/jirka-staging/commits/migration-recovery
>
> The series does several things:
> - persists current job and its phase in status xml
> - allows safe monitor commands to be run during migration/save/dump jobs
> - implements recovery when libvirtd is restarted while a job is active
> - consolidates some code and fixes bugs I found when working in the area
>
> The series is not perfect and still needs some corner cases to be
> fixed but I think it's better to send the series for review now and
> add small additional fixes in the next version(s) instead of waiting
> for it to be perfect.

OK, I pushed the following patches (01-08/19 and 13/19), which were
already ACKed. When doing so, I also updated the documentation in
src/qemu/THREADS.txt as part of the "qemu: Allow all query commands to
be run during long jobs" patch. The diff to THREADS.txt is attached.

    qemu: Separate job related data into a new object
    qemu: Consolidate BeginJob{,WithDriver} into a single method
    qemu: Consolidate {Enter,Exit}Monitor{,WithDriver}
    qemu: Allow all query commands to be run during long jobs
    qemu: Save job type in domain status XML
    qemu: Recover from interrupted jobs
    qemu: Add support for job phase
    qemu: Consolidate qemuMigrationPrepare{Direct,Tunnel}
    qemu: Fix monitor unlocking in some error paths

Jirka

diff --git a/src/qemu/THREADS.txt b/src/qemu/THREADS.txt
index 1e0b5ab..3a27a85 100644
--- a/src/qemu/THREADS.txt
+++ b/src/qemu/THREADS.txt
@@ -49,17 +49,39 @@ There are a number of locks on various objects
 
- * qemuMonitorPrivatePtr: Job condition
+ * qemuMonitorPrivatePtr: Job conditions
 
     Since virDomainObjPtr lock must not be held during sleeps, the job
-    condition provides additional protection for code making updates.
+    conditions provide additional protection for code making updates.
+
+    Qemu driver uses two kinds of job conditions: asynchronous and
+    normal.
+
+    Asynchronous job condition is used for long running jobs (such as
+    migration) that consist of several monitor commands and it is
+    desirable to allow calling a limited set of other monitor commands
+    while such job is running. This allows clients to, e.g., query
+    statistical data, cancel the job, or change parameters of the job.
+
+    Normal job condition is used by all other jobs to get exclusive
+    access to the monitor and also by every monitor command issued by
+    an asynchronous job. When acquiring normal job condition, the job
+    must specify what kind of action it is about to take and this is
+    checked against the allowed set of jobs in case an asynchronous
+    job is running. If the job is incompatible with current
+    asynchronous job, it needs to wait until the asynchronous job ends
+    and try to acquire the job again.
 
     Immediately after acquiring the virDomainObjPtr lock, any method
-    which intends to update state must acquire the job condition. The
-    virDomainObjPtr lock is released while blocking on this condition
-    variable. Once the job condition is acquired, a method can safely
-    release the virDomainObjPtr lock whenever it hits a piece of code
-    which may sleep/wait, and re-acquire it after the sleep/wait.
+    which intends to update state must acquire either asynchronous or
+    normal job condition. The virDomainObjPtr lock is released while
+    blocking on these condition variables. Once the job condition is
+    acquired, a method can safely release the virDomainObjPtr lock
+    whenever it hits a piece of code which may sleep/wait, and
+    re-acquire it after the sleep/wait. Whenever an asynchronous job
+    wants to talk to the monitor, it needs to acquire nested job (a
+    special kind of normal job) to obtain exclusive access to the
+    monitor.
 
     Since the virDomainObjPtr lock was dropped while waiting for the
     job condition, it is possible that the domain is no longer active
@@ -111,31 +133,74 @@ To lock the virDomainObjPtr
 
-To acquire the job mutex
+To acquire the normal job condition
 
   qemuDomainObjBeginJob()           (if driver is unlocked)
    - Increments ref count on virDomainObjPtr
-   - Wait qemuDomainObjPrivate condition 'jobActive != 0' using
-     virDomainObjPtr mutex
-   - Sets jobActive to 1
+   - Waits until the job is compatible with current async job or no
+     async job is running
+   - Waits job.cond condition 'job.active != 0' using virDomainObjPtr
+     mutex
+   - Rechecks if the job is still compatible and repeats waiting if it
+     isn't
+   - Sets job.active to the job type
 
   qemuDomainObjBeginJobWithDriver() (if driver needs to be locked)
-   - Unlocks driver
    - Increments ref count on virDomainObjPtr
-   - Wait qemuDomainObjPrivate condition 'jobActive != 0' using
-     virDomainObjPtr mutex
-   - Sets jobActive to 1
+   - Unlocks driver
+
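The BeginJob flow the THREADS.txt diff describes — wait until the
requested job is compatible with the current async job, wait for the
normal job slot, then recheck compatibility — can be sketched with a
condition variable. All names below (struct priv, begin_job, end_job)
are illustrative, not libvirt's actual implementation:

```c
#include <pthread.h>
#include <stdbool.h>

enum job { JOB_NONE = 0, JOB_QUERY, JOB_MODIFY };

struct priv {
    pthread_mutex_t lock;        /* stands in for the virDomainObjPtr lock */
    pthread_cond_t  cond;        /* job.cond */
    enum job        active;      /* job.active: current normal job */
    unsigned int    async_allowed; /* bitmask of jobs the async job permits */
    bool            async_running;
};

static bool compatible(const struct priv *p, enum job job)
{
    return !p->async_running || (p->async_allowed & (1u << job));
}

static void begin_job(struct priv *p, enum job job)
{
    pthread_mutex_lock(&p->lock);
    for (;;) {
        /* Wait until the job is compatible with the current async job. */
        while (!compatible(p, job))
            pthread_cond_wait(&p->cond, &p->lock);
        /* Wait for the normal job slot to be free. */
        while (p->active != JOB_NONE)
            pthread_cond_wait(&p->cond, &p->lock);
        /* Recheck: the async job may have changed while we slept. */
        if (compatible(p, job))
            break;
    }
    p->active = job;             /* job.active now records the job type */
    pthread_mutex_unlock(&p->lock);
}

static void end_job(struct priv *p)
{
    pthread_mutex_lock(&p->lock);
    p->active = JOB_NONE;
    pthread_cond_broadcast(&p->cond);
    pthread_mutex_unlock(&p->lock);
}
```

The recheck after the second wait is what lets an incompatible job go
back to sleep instead of sneaking in between an async job ending and a
new one starting.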
Re: [libvirt] [PATCH 00/19] Rollback migration when libvirtd restarts
On Mon, Jul 11, 2011 at 21:33:41 +0100, Daniel P. Berrange wrote:
> On Fri, Jul 08, 2011 at 01:34:05AM +0200, Jiri Denemark wrote:
>> This series is also available at
>> https://gitorious.org/~jirka/libvirt/jirka-staging/commits/migration-recovery
>>
>> The series does several things:
>> - persists current job and its phase in status xml
>> - allows safe monitor commands to be run during migration/save/dump jobs
>> - implements recovery when libvirtd is restarted while a job is active
>> - consolidates some code and fixes bugs I found when working in the area
>>
>> The series is not perfect and still needs some corner cases to be
>> fixed but I think it's better to send the series for review now and
>> add small additional fixes in the next version(s) instead of waiting
>> for it to be perfect.
>
> What level of testing has it had so far? In particular, does it pass
> the giant migration test case in the TCK?

Only manual testing so far, with a special focus on failed migration
and recovery/rollback. I tested successful migrations as well, but only
a limited set of options of normal/p2p migration, and libvirt was
exactly the same version for client/source/dest. I still need to run
this through the migration test case, but I don't have a setup for
that. Could you run this easily through the beast? That would be
perfect, but if not, I'll create the setup for it and run it myself.

Jirka

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/19] Rollback migration when libvirtd restarts
On Fri, Jul 08, 2011 at 01:34:05AM +0200, Jiri Denemark wrote:
> This series is also available at
> https://gitorious.org/~jirka/libvirt/jirka-staging/commits/migration-recovery
>
> The series does several things:
> - persists current job and its phase in status xml
> - allows safe monitor commands to be run during migration/save/dump jobs
> - implements recovery when libvirtd is restarted while a job is active
> - consolidates some code and fixes bugs I found when working in the area
>
> The series is not perfect and still needs some corner cases to be
> fixed but I think it's better to send the series for review now and
> add small additional fixes in the next version(s) instead of waiting
> for it to be perfect.

What level of testing has it had so far? In particular, does it pass
the giant migration test case in the TCK? And if so, what combinations
of client/source/dest libvirt were tested?

Regards,
Daniel

--
|: http://berrange.com  -o-  http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org   -o-  http://virt-manager.org :|
|: http://autobuild.org  -o-  http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org  -o-  http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list