[libvirt] [PATCH]: Fix non-live migration failure

2009-02-27 Thread Chris Lalancette
There is a logic error in the Qemu driver when doing a non-live migrate.
During a non-live migrate, on the source host during the Perform step, we
pause the domain; however, if there was ever a failure, we were forgetting
to unpause the domain, meaning that the domain was paused forever.  Add a
flag to tell us when we should unpause the domain after a failure.

Also, as pointed out by DV, we were unnecessarily using a snprintf to a
buffer to execute qemu monitor commands, when we could get away with
just sending a static string.  Fix that as well.

Signed-off-by: Chris Lalancette clala...@redhat.com
diff --git a/src/qemu_driver.c b/src/qemu_driver.c
index a8a2ae7..11782b4 100644
--- a/src/qemu_driver.c
+++ b/src/qemu_driver.c
@@ -4331,6 +4331,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
 char cmd[HOST_NAME_MAX+50];
 char *info = NULL;
 int ret = -1;
+int paused = 0;
 
 qemuDriverLock(driver);
 vm = virDomainFindByID(driver-domains, dom-id);
@@ -4348,10 +4349,14 @@ qemudDomainMigratePerform (virDomainPtr dom,
 
 if (!(flags  VIR_MIGRATE_LIVE)) {
 /* Pause domain for non-live migration */
-snprintf(cmd, sizeof cmd, %s, stop);
-qemudMonitorCommand (vm, cmd, info);
+if (qemudMonitorCommand (vm, stop, info)  0) {
+qemudReportError(dom-conn, dom, NULL, VIR_ERR_OPERATION_FAILED,
+ %s, _(off-line migration specified, but suspend operation failed));
+goto cleanup;
+}
 DEBUG (stop reply: %s, info);
 VIR_FREE(info);
+paused = 1;
 
 event = virDomainEventNewFromObj(vm,
  VIR_DOMAIN_EVENT_SUSPENDED,
@@ -4396,6 +4401,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
 
 /* Clean up the source domain. */
 qemudShutdownVMDaemon (dom-conn, driver, vm);
+paused = 0;
 
 event = virDomainEventNewFromObj(vm,
  VIR_DOMAIN_EVENT_STOPPED,
@@ -4407,7 +4413,31 @@ qemudDomainMigratePerform (virDomainPtr dom,
 ret = 0;
 
 cleanup:
+/* Note that we have to free info *first*, since we are re-using the
+ * variable below (and otherwise might cause a memory leak)
+ */
 VIR_FREE(info);
+
+if (paused) {
+/* we got here through some sort of failure; start the domain again */
+if (qemudMonitorCommand (vm, cont, info)  0) {
+/* Hm, we already know we are in error here.  We don't want to
+ * overwrite the previous error, though, so we just throw something
+ * to the logs and hope for the best
+ */
+qemudLog(QEMUD_ERROR, _(Failed to resume guest %s after failure\n),
+ vm-def-name);
+}
+else {
+DEBUG (cont reply: %s, info);
+VIR_FREE(info);
+}
+
+event = virDomainEventNewFromObj(vm,
+ VIR_DOMAIN_EVENT_RESUMED,
+ VIR_DOMAIN_EVENT_RESUMED_MIGRATED);
+}
+
 if (vm)
 virDomainObjUnlock(vm);
 if (event)
--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH]: Fix non-live migration failure

2009-02-27 Thread Daniel Veillard
On Fri, Feb 27, 2009 at 01:12:40PM +0100, Chris Lalancette wrote:
 There is a logic error in the Qemu driver when doing a non-live migrate.
 During a non-live migrate, on the source host during the Perform step, we
 pause the domain; however, if there was ever a failure, we were forgetting
 to unpause the domain, meaning that the domain was paused forever.  Add a
 flag to tell us when we should unpause the domain after a failure.
 
 Also, as pointed out by DV, we were unnecessarily using a snprintf to a
 buffer to execute qemu monitor commands, when we could get away with
 just sending a static string.  Fix that as well.

  Oh right I had missed the leak :-)
Looks fine to me now,

Daniel


-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH]: Fix non-live migration failure

2009-02-27 Thread Chris Lalancette
Daniel Veillard wrote:
 On Fri, Feb 27, 2009 at 01:12:40PM +0100, Chris Lalancette wrote:
 There is a logic error in the Qemu driver when doing a non-live migrate.
 During a non-live migrate, on the source host during the Perform step, we
 pause the domain; however, if there was ever a failure, we were forgetting
 to unpause the domain, meaning that the domain was paused forever.  Add a
 flag to tell us when we should unpause the domain after a failure.

 Also, as pointed out by DV, we were unnecessarily using a snprintf to a
 buffer to execute qemu monitor commands, when we could get away with
 just sending a static string.  Fix that as well.
 
   Oh right I had missed the leak :-)
 Looks fine to me now,

Great, thanks for looking again.  Now committed.

-- 
Chris Lalancette

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH]: Fix non-live migration failure case

2009-02-26 Thread Daniel Veillard
On Wed, Feb 25, 2009 at 02:30:35PM +0100, Chris Lalancette wrote:
 All,
 There was a logic error in the Qemu driver when doing a non-live migrate.
 During a non-live migrate, on the source host during the Perform step, we
 pause the domain; however, if there was ever a failure, we were forgetting
 to unpause the domain, meaning that the domain was paused forever.  Add a
 flag to tell us when we should unpause the domain after a failure.

  Princile sounds fine ... but 

 Signed-off-by: Chris Lalancette clala...@redhat.com

 diff --git a/src/qemu_driver.c b/src/qemu_driver.c
 index a8a2ae7..bcc4690 100644
 --- a/src/qemu_driver.c
 +++ b/src/qemu_driver.c
 @@ -4331,6 +4331,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
  char cmd[HOST_NAME_MAX+50];
  char *info = NULL;
  int ret = -1;
 +int paused = 0;
  
  qemuDriverLock(driver);
  vm = virDomainFindByID(driver-domains, dom-id);
 @@ -4352,6 +4353,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
  qemudMonitorCommand (vm, cmd, info);
  DEBUG (stop reply: %s, info);
  VIR_FREE(info);
 +paused = 1;
  
  event = virDomainEventNewFromObj(vm,
   VIR_DOMAIN_EVENT_SUSPENDED,
 @@ -4407,6 +4409,21 @@ qemudDomainMigratePerform (virDomainPtr dom,
  ret = 0;
  
  cleanup:
 +if (ret != 0  paused) {
 +/* we got here through some sort of failure; start the domain again 
 */
 +snprintf(cmd, sizeof cmd, %s, cont);

  sizeof without braces ?

 +qemudMonitorCommand (vm, cmd, info);

   and why not just doing 
   qemudMonitorCommand (vm, cont, info);
and avoiding this ?
   and do we care about error handling there ?

 +DEBUG (cont reply: %s, info);
 +VIR_FREE(info);

  no need to do VIR_FREE(info) there since it's done below

 +event = virDomainEventNewFromObj(vm,
 + VIR_DOMAIN_EVENT_RESUMED,
 + VIR_DOMAIN_EVENT_RESUMED_MIGRATED);
 +if (event)
 +qemuDomainEventQueue(driver, event);
 +event = NULL;
 +}
 +
  VIR_FREE(info);

Daniel

-- 
Daniel Veillard  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH]: Fix non-live migration failure case

2009-02-26 Thread Chris Lalancette
Daniel Veillard wrote:
 On Wed, Feb 25, 2009 at 02:30:35PM +0100, Chris Lalancette wrote:
 All,
 There was a logic error in the Qemu driver when doing a non-live migrate.
 During a non-live migrate, on the source host during the Perform step, we
 pause the domain; however, if there was ever a failure, we were forgetting
 to unpause the domain, meaning that the domain was paused forever.  Add a
 flag to tell us when we should unpause the domain after a failure.
 
   Princile sounds fine ... but 
 
 Signed-off-by: Chris Lalancette clala...@redhat.com
 
 diff --git a/src/qemu_driver.c b/src/qemu_driver.c
 index a8a2ae7..bcc4690 100644
 --- a/src/qemu_driver.c
 +++ b/src/qemu_driver.c
 @@ -4331,6 +4331,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
  char cmd[HOST_NAME_MAX+50];
  char *info = NULL;
  int ret = -1;
 +int paused = 0;
  
  qemuDriverLock(driver);
  vm = virDomainFindByID(driver-domains, dom-id);
 @@ -4352,6 +4353,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
  qemudMonitorCommand (vm, cmd, info);
  DEBUG (stop reply: %s, info);
  VIR_FREE(info);
 +paused = 1;
  
  event = virDomainEventNewFromObj(vm,
   VIR_DOMAIN_EVENT_SUSPENDED,
 @@ -4407,6 +4409,21 @@ qemudDomainMigratePerform (virDomainPtr dom,
  ret = 0;
  
  cleanup:
 +if (ret != 0  paused) {
 +/* we got here through some sort of failure; start the domain again 
 */
 +snprintf(cmd, sizeof cmd, %s, cont);
 
   sizeof without braces ?
 
 +qemudMonitorCommand (vm, cmd, info);
 
and why not just doing 
qemudMonitorCommand (vm, cont, info);
 and avoiding this ?
and do we care about error handling there ?
 
 +DEBUG (cont reply: %s, info);
 +VIR_FREE(info);
 
   no need to do VIR_FREE(info) there since it's done below
 
 +event = virDomainEventNewFromObj(vm,
 + VIR_DOMAIN_EVENT_RESUMED,
 + VIR_DOMAIN_EVENT_RESUMED_MIGRATED);
 +if (event)
 +qemuDomainEventQueue(driver, event);
 +event = NULL;
 +}
 +
  VIR_FREE(info);

As we discussed on IRC, this was a quick cut-n-paste job, and I should have
looked at it better.  The place where I copied it from is also needlessly using
snprintf (I think), so I'll spin a new patch that fixes that too.  Your other
comments are dead-on, and also make me realize I introduced a memory leak here.
 I'll fix all of this up and re-post.

Thanks,
-- 
Chris Lalancette

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH]: Fix non-live migration failure case

2009-02-25 Thread Chris Lalancette
All,
There was a logic error in the Qemu driver when doing a non-live migrate.
During a non-live migrate, on the source host during the Perform step, we
pause the domain; however, if there was ever a failure, we were forgetting
to unpause the domain, meaning that the domain was paused forever.  Add a
flag to tell us when we should unpause the domain after a failure.

Signed-off-by: Chris Lalancette clala...@redhat.com
diff --git a/src/qemu_driver.c b/src/qemu_driver.c
index a8a2ae7..bcc4690 100644
--- a/src/qemu_driver.c
+++ b/src/qemu_driver.c
@@ -4331,6 +4331,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
 char cmd[HOST_NAME_MAX+50];
 char *info = NULL;
 int ret = -1;
+int paused = 0;
 
 qemuDriverLock(driver);
 vm = virDomainFindByID(driver-domains, dom-id);
@@ -4352,6 +4353,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
 qemudMonitorCommand (vm, cmd, info);
 DEBUG (stop reply: %s, info);
 VIR_FREE(info);
+paused = 1;
 
 event = virDomainEventNewFromObj(vm,
  VIR_DOMAIN_EVENT_SUSPENDED,
@@ -4407,6 +4409,21 @@ qemudDomainMigratePerform (virDomainPtr dom,
 ret = 0;
 
 cleanup:
+if (ret != 0  paused) {
+/* we got here through some sort of failure; start the domain again */
+snprintf(cmd, sizeof cmd, %s, cont);
+qemudMonitorCommand (vm, cmd, info);
+DEBUG (cont reply: %s, info);
+VIR_FREE(info);
+
+event = virDomainEventNewFromObj(vm,
+ VIR_DOMAIN_EVENT_RESUMED,
+ VIR_DOMAIN_EVENT_RESUMED_MIGRATED);
+if (event)
+qemuDomainEventQueue(driver, event);
+event = NULL;
+}
+
 VIR_FREE(info);
 if (vm)
 virDomainObjUnlock(vm);
--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list