Re: [libvirt] [PATCH 04/19] qemu: Allow all query commands to be run during long jobs

2011-07-27 Thread Eric Blake

On 07/07/2011 05:34 PM, Jiri Denemark wrote:

> Query commands are safe to be called during long running jobs (such as
> migration). This patch makes them all work without the need to
> special-case every single one of them.


Git bisect says that this was the culprit commit that broke 'virsh 
managedsave'.



> +static int
>  qemuDomainObjEnterMonitorInternal(struct qemud_driver *driver,
>                                    virDomainObjPtr obj)
>  {
>      qemuDomainObjPrivatePtr priv = obj->privateData;
>
> +    if (priv->job.active == QEMU_JOB_NONE && priv->job.asyncJob) {
> +        if (qemuDomainObjBeginNestedJob(obj) < 0)
> +            return -1;
> +        if (!virDomainObjIsActive(obj)) {
> +            qemuReportError(VIR_ERR_OPERATION_FAILED, "%s",
> +                            _("domain is no longer running"));
> +            return -1;
> +        }
> +    }


I think this is the problem.  Doing a managed save will eventually make 
the qemu process go away, so we reach a point where we cannot issue a 
query monitor command to see how the save is progressing.  But this 
function only checks that vm is still active for QEMU_JOB_NONE, not for 
QEMU_ASYNC_JOB_SAVE.


I'm trying out a patch now...

--
Eric Blake   ebl...@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 04/19] qemu: Allow all query commands to be run during long jobs

2011-07-11 Thread Daniel P. Berrange
On Fri, Jul 08, 2011 at 01:34:09AM +0200, Jiri Denemark wrote:
> Query commands are safe to be called during long running jobs (such as
> migration). This patch makes them all work without the need to
> special-case every single one of them.
> 
> The patch introduces new job.asyncCond condition and associated
> job.asyncJob which are dedicated to asynchronous (from qemu monitor
> point of view) jobs that can take arbitrarily long time to finish while
> qemu monitor is still usable for other commands.
> 
> The existing job.active (and job.cond condition) is used for all other
> synchronous jobs (including the commands run during async job).
> 
> Locking schema is changed to use these two conditions. While asyncJob is
> active, only a restricted set of synchronous jobs is allowed (the set can
> differ according to the particular asyncJob) so any method that
> communicates to qemu monitor needs to check if it is allowed to be
> executed during current asyncJob (if any). Once the check passes, the
> method needs to normally acquire job.cond to ensure no other command is
> running. Since domain object lock is released during that time, asyncJob
> could have been started in the meantime so the method needs to recheck
> the first condition. Then, normal jobs set job.active and asynchronous
> jobs set job.asyncJob and optionally change the list of allowed job
> groups.
> 
> Since asynchronous jobs only set job.asyncJob, other allowed commands
> can still be run when domain object is unlocked (when communicating to
> remote libvirtd or sleeping). To protect its own internal synchronous
> commands, the asynchronous job needs to start a special nested job
> before entering qemu monitor. The nested job doesn't check asyncJob, it
> only acquires job.cond and sets job.active to block other jobs.
> ---
>  src/qemu/qemu_domain.c    |  219 +
>  src/qemu/qemu_domain.h    |   82 +
>  src/qemu/qemu_driver.c    |  122 +-
>  src/qemu/qemu_hotplug.c   |   38
>  src/qemu/qemu_migration.c |  152 ++-
>  src/qemu/qemu_process.c   |   42 +
>  6 files changed, 439 insertions(+), 216 deletions(-)

ACK

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
