Re: [libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-22 Thread John Ferlan

My Coverity scan found two issues both FORWARD_NULL...


qemuDomainLookupByName

and

qemuMigrationPrepareAny


On 12/16/2014 05:15 AM, Martin Kletzander wrote:
 There is one problem that causes various errors in the daemon.  When
 domain is waiting for a job, it is unlocked while waiting on the
 condition.  However, if that domain is for example transient and being
 removed in another API (e.g. cancelling incoming migration), it get's
 unref'd.  If the first call, that was waiting, fails to get the job, it
 unref's the domain object, and because it was the last reference, it
 causes clearing of the whole domain object.  However, when finishing the
 call, the domain must be unlocked, but there is no way for the API to
 know whether it was cleaned or not (unless there is some ugly temporary
 variable, but let's scratch that).
 
 The root cause is that our APIs don't ref the objects they are using and
 all use the implicit reference that the object has when it is in the
 domain list.  That reference can be removed when the API is waiting for
 a job.  And because each domain doesn't do its ref'ing, it results in
 the ugly checking of the return value of virObjectUnref() that we have
 everywhere.
 
 This patch changes qemuDomObjFromDomain() to ref the domain (using
 virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
 should be the only function in which the return value of
 virObjectUnref() is checked.  This makes all reference counting
 deterministic and makes the code a bit clearer.
 
 Signed-off-by: Martin Kletzander mklet...@redhat.com
 ---
 v2:
  - Comments from Peter and Daniel on v1 implemented, rather not
listing them here as the list was pretty comprehensive and it would
make the reviewer focus on that.
 
  src/qemu/THREADS.txt  |  40 ++-
  src/qemu/qemu_domain.c|  49 ++--
  src/qemu/qemu_domain.h|  12 +-
  src/qemu/qemu_driver.c| 708 
 --
  src/qemu/qemu_migration.c | 111 +++-
  src/qemu/qemu_migration.h |  10 +-
  src/qemu/qemu_process.c   |  77 ++---
  7 files changed, 371 insertions(+), 636 deletions(-)
 

...snip...

 @@ -1517,8 +1515,7 @@ static virDomainPtr 
 qemuDomainLookupByName(virConnectPtr conn,
  if (dom) dom-id = vm-def-id;
 
   cleanup:
 -if (vm)
 -virObjectUnlock(vm);
 +virObjectUnlock(vm);

Um... Did you mean the qemuDomObjEndAPI call here?


  return dom;
  }
 
...snip...

 @@ -3099,12 +3089,7 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
   * This prevents any other APIs being invoked while incoming
   * migration is taking place.
   */
 -if (!qemuMigrationJobContinue(vm)) {
 -vm = NULL;
 -virReportError(VIR_ERR_OPERATION_FAILED,
 -   %s, _(domain disappeared));
 -goto cleanup;
 -}
 +qemuMigrationJobContinue(vm);
 
  if (autoPort)
  priv-migrationPort = port;
 @@ -3115,16 +3100,12 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
  VIR_FREE(xmlout);
  VIR_FORCE_CLOSE(dataFD[0]);
  VIR_FORCE_CLOSE(dataFD[1]);
 -if (vm) {
 -if (ret  0) {
 -virPortAllocatorRelease(driver-migrationPorts, priv-nbdPort);
 -priv-nbdPort = 0;
 -}
 -if (ret = 0 || vm-persistent)
 -virObjectUnlock(vm);
 -else
 -qemuDomainRemoveInactive(driver, vm);
 +if (ret  0) {
 +virPortAllocatorRelease(driver-migrationPorts, priv-nbdPort);

We can get here with priv == NULL from numerous places...


 +priv-nbdPort = 0;
 +qemuDomainRemoveInactive(driver, vm);
  }
 +qemuDomObjEndAPI(vm);
  if (event)
  qemuDomainEventQueue(driver, event);
  qemuMigrationCookieFree(mig);
 @@ -3137,8 +3118,7 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
  qemuProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_FAILED, 0);
 
   endjob:
 -if (!qemuMigrationJobFinish(driver, vm))
 -vm = NULL;
 +qemuMigrationJobFinish(driver, vm);
  goto cleanup;
  }
 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-22 Thread Martin Kletzander

On Mon, Dec 22, 2014 at 07:27:29AM -0500, John Ferlan wrote:


My Coverity scan found two issues both FORWARD_NULL...


qemuDomainLookupByName

and

qemuMigrationPrepareAny


On 12/16/2014 05:15 AM, Martin Kletzander wrote:

There is one problem that causes various errors in the daemon.  When
domain is waiting for a job, it is unlocked while waiting on the
condition.  However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it get's
unref'd.  If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object.  However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).

The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list.  That reference can be removed when the API is waiting for
a job.  And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.

This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked.  This makes all reference counting
deterministic and makes the code a bit clearer.

Signed-off-by: Martin Kletzander mklet...@redhat.com
---
v2:
 - Comments from Peter and Daniel on v1 implemented, rather not
   listing them here as the list was pretty comprehensive and it would
   make the reviewer focus on that.

 src/qemu/THREADS.txt  |  40 ++-
 src/qemu/qemu_domain.c|  49 ++--
 src/qemu/qemu_domain.h|  12 +-
 src/qemu/qemu_driver.c| 708 --
 src/qemu/qemu_migration.c | 111 +++-
 src/qemu/qemu_migration.h |  10 +-
 src/qemu/qemu_process.c   |  77 ++---
 7 files changed, 371 insertions(+), 636 deletions(-)



...snip...


@@ -1517,8 +1515,7 @@ static virDomainPtr qemuDomainLookupByName(virConnectPtr 
conn,
 if (dom) dom-id = vm-def-id;

  cleanup:
-if (vm)
-virObjectUnlock(vm);
+virObjectUnlock(vm);


Um... Did you mean the qemuDomObjEndAPI call here?



No, because the vm was gotten using virDomainObjListFindByName() which
does not ref it.  But instead it should be kept as it was.




 return dom;
 }


...snip...


@@ -3099,12 +3089,7 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
  * This prevents any other APIs being invoked while incoming
  * migration is taking place.
  */
-if (!qemuMigrationJobContinue(vm)) {
-vm = NULL;
-virReportError(VIR_ERR_OPERATION_FAILED,
-   %s, _(domain disappeared));
-goto cleanup;
-}
+qemuMigrationJobContinue(vm);

 if (autoPort)
 priv-migrationPort = port;
@@ -3115,16 +3100,12 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
 VIR_FREE(xmlout);
 VIR_FORCE_CLOSE(dataFD[0]);
 VIR_FORCE_CLOSE(dataFD[1]);
-if (vm) {
-if (ret  0) {
-virPortAllocatorRelease(driver-migrationPorts, priv-nbdPort);
-priv-nbdPort = 0;
-}
-if (ret = 0 || vm-persistent)
-virObjectUnlock(vm);
-else
-qemuDomainRemoveInactive(driver, vm);
+if (ret  0) {
+virPortAllocatorRelease(driver-migrationPorts, priv-nbdPort);


We can get here with priv == NULL from numerous places...



Yes, good point.  The following diff should suffice for both issues,
right?  If you agree, do you want me to send a patch or just push it?

diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c
index 673d8a6..73a825d 100644
--- i/src/qemu/qemu_driver.c
+++ w/src/qemu/qemu_driver.c
@@ -1516,7 +1516,8 @@ static virDomainPtr qemuDomainLookupByName(virConnectPtr 
conn,
if (dom) dom-id = vm-def-id;

 cleanup:
-virObjectUnlock(vm);
+if (vm)
+virObjectUnlock(vm);
return dom;
}

diff --git i/src/qemu/qemu_migration.c w/src/qemu/qemu_migration.c
index 1db6630..77e0b35 100644
--- i/src/qemu/qemu_migration.c
+++ w/src/qemu/qemu_migration.c
@@ -3101,7 +3101,9 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
VIR_FREE(xmlout);
VIR_FORCE_CLOSE(dataFD[0]);
VIR_FORCE_CLOSE(dataFD[1]);
-if (ret  0) {
+if (ret  0  priv) {
+/* priv is set right after vm is added to the list of domains
+ * and there is no 'goto cleanup;' in the middle of those */
virPortAllocatorRelease(driver-migrationPorts, priv-nbdPort);
priv-nbdPort = 0;
qemuDomainRemoveInactive(driver, vm);
--

Martin


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-22 Thread John Ferlan


On 12/22/2014 04:02 PM, Martin Kletzander wrote:
 On Mon, Dec 22, 2014 at 07:27:29AM -0500, John Ferlan wrote:

 My Coverity scan found two issues both FORWARD_NULL...


 qemuDomainLookupByName

 and

 qemuMigrationPrepareAny


 On 12/16/2014 05:15 AM, Martin Kletzander wrote:
 There is one problem that causes various errors in the daemon.  When
 domain is waiting for a job, it is unlocked while waiting on the
 condition.  However, if that domain is for example transient and being
 removed in another API (e.g. cancelling incoming migration), it get's
 unref'd.  If the first call, that was waiting, fails to get the job, it
 unref's the domain object, and because it was the last reference, it
 causes clearing of the whole domain object.  However, when finishing the
 call, the domain must be unlocked, but there is no way for the API to
 know whether it was cleaned or not (unless there is some ugly temporary
 variable, but let's scratch that).

 The root cause is that our APIs don't ref the objects they are using and
 all use the implicit reference that the object has when it is in the
 domain list.  That reference can be removed when the API is waiting for
 a job.  And because each domain doesn't do its ref'ing, it results in
 the ugly checking of the return value of virObjectUnref() that we have
 everywhere.

 This patch changes qemuDomObjFromDomain() to ref the domain (using
 virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
 should be the only function in which the return value of
 virObjectUnref() is checked.  This makes all reference counting
 deterministic and makes the code a bit clearer.

 Signed-off-by: Martin Kletzander mklet...@redhat.com
 ---
 v2:
  - Comments from Peter and Daniel on v1 implemented, rather not
listing them here as the list was pretty comprehensive and it would
make the reviewer focus on that.

  src/qemu/THREADS.txt  |  40 ++-
  src/qemu/qemu_domain.c|  49 ++--
  src/qemu/qemu_domain.h|  12 +-
  src/qemu/qemu_driver.c| 708
 --
  src/qemu/qemu_migration.c | 111 +++-
  src/qemu/qemu_migration.h |  10 +-
  src/qemu/qemu_process.c   |  77 ++---
  7 files changed, 371 insertions(+), 636 deletions(-)


 ...snip...

 @@ -1517,8 +1515,7 @@ static virDomainPtr
 qemuDomainLookupByName(virConnectPtr conn,
  if (dom) dom-id = vm-def-id;

   cleanup:
 -if (vm)
 -virObjectUnlock(vm);
 +virObjectUnlock(vm);

 Um... Did you mean the qemuDomObjEndAPI call here?

 
 No, because the vm was gotten using virDomainObjListFindByName() which
 does not ref it.  But instead it should be kept as it was.
 

  return dom;
  }

 ...snip...

 @@ -3099,12 +3089,7 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
   * This prevents any other APIs being invoked while incoming
   * migration is taking place.
   */
 -if (!qemuMigrationJobContinue(vm)) {
 -vm = NULL;
 -virReportError(VIR_ERR_OPERATION_FAILED,
 -   %s, _(domain disappeared));
 -goto cleanup;
 -}
 +qemuMigrationJobContinue(vm);

  if (autoPort)
  priv-migrationPort = port;
 @@ -3115,16 +3100,12 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
  VIR_FREE(xmlout);
  VIR_FORCE_CLOSE(dataFD[0]);
  VIR_FORCE_CLOSE(dataFD[1]);
 -if (vm) {
 -if (ret  0) {
 -virPortAllocatorRelease(driver-migrationPorts,
 priv-nbdPort);
 -priv-nbdPort = 0;
 -}
 -if (ret = 0 || vm-persistent)
 -virObjectUnlock(vm);
 -else
 -qemuDomainRemoveInactive(driver, vm);
 +if (ret  0) {
 +virPortAllocatorRelease(driver-migrationPorts, priv-nbdPort);

 We can get here with priv == NULL from numerous places...

 
 Yes, good point.  The following diff should suffice for both issues,
 right?  If you agree, do you want me to send a patch or just push it?
 
 diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c
 index 673d8a6..73a825d 100644
 --- i/src/qemu/qemu_driver.c
 +++ w/src/qemu/qemu_driver.c
 @@ -1516,7 +1516,8 @@ static virDomainPtr
 qemuDomainLookupByName(virConnectPtr conn,
 if (dom) dom-id = vm-def-id;
 
  cleanup:
 -virObjectUnlock(vm);
 +if (vm)
 +virObjectUnlock(vm);
 return dom;
 }
 
 diff --git i/src/qemu/qemu_migration.c w/src/qemu/qemu_migration.c
 index 1db6630..77e0b35 100644
 --- i/src/qemu/qemu_migration.c
 +++ w/src/qemu/qemu_migration.c
 @@ -3101,7 +3101,9 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
 VIR_FREE(xmlout);
 VIR_FORCE_CLOSE(dataFD[0]);
 VIR_FORCE_CLOSE(dataFD[1]);
 -if (ret  0) {
 +if (ret  0  priv) {
 +/* priv is set right after vm is added to the list of domains
 + * and there is no 'goto cleanup;' in the middle of those */
 virPortAllocatorRelease(driver-migrationPorts, priv-nbdPort);
 priv-nbdPort = 0;
 qemuDomainRemoveInactive(driver, vm);

Makes 

Re: [libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-21 Thread Martin Kletzander

On Tue, Dec 16, 2014 at 04:51:59PM +0100, Martin Kletzander wrote:

..., but I'll wait with the pushing so others have a chance to
react.



There's no reaction for a while now and since I'd like to push this in
the early stage of a release cycle, I'm pushing it now.  This touches
almost all APIs and hence we want some time to fix things nobody
noticed yet.

Martin


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-16 Thread Martin Kletzander
There is one problem that causes various errors in the daemon.  When
domain is waiting for a job, it is unlocked while waiting on the
condition.  However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it get's
unref'd.  If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object.  However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).

The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list.  That reference can be removed when the API is waiting for
a job.  And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.

This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked.  This makes all reference counting
deterministic and makes the code a bit clearer.

Signed-off-by: Martin Kletzander mklet...@redhat.com
---
v2:
 - Comments from Peter and Daniel on v1 implemented, rather not
   listing them here as the list was pretty comprehensive and it would
   make the reviewer focus on that.

 src/qemu/THREADS.txt  |  40 ++-
 src/qemu/qemu_domain.c|  49 ++--
 src/qemu/qemu_domain.h|  12 +-
 src/qemu/qemu_driver.c| 708 --
 src/qemu/qemu_migration.c | 111 +++-
 src/qemu/qemu_migration.h |  10 +-
 src/qemu/qemu_process.c   |  77 ++---
 7 files changed, 371 insertions(+), 636 deletions(-)

diff --git a/src/qemu/THREADS.txt b/src/qemu/THREADS.txt
index 50a0cf9..add2a35 100644
--- a/src/qemu/THREADS.txt
+++ b/src/qemu/THREADS.txt
@@ -26,12 +26,20 @@ There are a number of locks on various objects
   * virDomainObjPtr

 Will be locked after calling any of the virDomainFindBy{ID,Name,UUID}
-methods.
+methods.  However, preferred method is qemuDomObjFromDomain() that uses
+virDomainFindByUUIDRef() which also increases the reference counter and
+finds the domain in the domain list without blocking all other lookups.
+When the domain is locked and the reference increased, the prefered way of
+decrementing the reference counter and unlocking the domain is using the
+qemuDomObjEndAPI() function.

 Lock must be held when changing/reading any variable in the virDomainObjPtr

 If the lock needs to be dropped  then re-acquired for a short period of
 time, the reference count must be incremented first using 
virDomainObjRef().
+There is no need to increase the reference count if qemuDomObjFromDomain()
+was used for looking up the domain.  In this case there is one reference
+already added by that function.

 This lock must not be held for anything which sleeps/waits (i.e. monitor
 commands).
@@ -109,7 +117,6 @@ To lock the virDomainObjPtr
 To acquire the normal job condition

   qemuDomainObjBeginJob()
-- Increments ref count on virDomainObjPtr
 - Waits until the job is compatible with current async job or no
   async job is running
 - Waits for job.cond condition 'job.active != 0' using virDomainObjPtr
@@ -122,14 +129,12 @@ To acquire the normal job condition
   qemuDomainObjEndJob()
 - Sets job.active to 0
 - Signals on job.cond condition
-- Decrements ref count on virDomainObjPtr



 To acquire the asynchronous job condition

   qemuDomainObjBeginAsyncJob()
-- Increments ref count on virDomainObjPtr
 - Waits until no async job is running
 - Waits for job.cond condition 'job.active != 0' using virDomainObjPtr
   mutex
@@ -141,7 +146,6 @@ To acquire the asynchronous job condition
   qemuDomainObjEndAsyncJob()
 - Sets job.asyncJob to 0
 - Broadcasts on job.asyncCond condition
-- Decrements ref count on virDomainObjPtr



@@ -179,12 +183,10 @@ To acquire the QEMU monitor lock as part of an 
asynchronous job
 To keep a domain alive while waiting on a remote command

   qemuDomainObjEnterRemote()
-- Increments ref count on virDomainObjPtr
 - Releases the virDomainObjPtr lock

   qemuDomainObjExitRemote()
 - Acquires the virDomainObjPtr lock
-- Decrements ref count on virDomainObjPtr


 Design patterns
@@ -195,18 +197,18 @@ Design patterns

  virDomainObjPtr obj;

- obj = virDomainFindByUUID(driver-domains, dom-uuid);
+ obj = qemuDomObjFromDomain(dom);

  ...do work...

- virDomainObjUnlock(obj);
+ qemuDomObjEndAPI(obj);


  * Updating something directly to do with a virDomainObjPtr

  virDomainObjPtr obj;

- obj = 

Re: [libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-16 Thread Michal Privoznik

On 16.12.2014 11:15, Martin Kletzander wrote:

There is one problem that causes various errors in the daemon.  When
domain is waiting for a job, it is unlocked while waiting on the
condition.  However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it get's
unref'd.  If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object.  However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).

The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list.  That reference can be removed when the API is waiting for
a job.  And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.

This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked.  This makes all reference counting
deterministic and makes the code a bit clearer.

Signed-off-by: Martin Kletzander mklet...@redhat.com
---
v2:
  - Comments from Peter and Daniel on v1 implemented, rather not
listing them here as the list was pretty comprehensive and it would
make the reviewer focus on that.

  src/qemu/THREADS.txt  |  40 ++-
  src/qemu/qemu_domain.c|  49 ++--
  src/qemu/qemu_domain.h|  12 +-
  src/qemu/qemu_driver.c| 708 --
  src/qemu/qemu_migration.c | 111 +++-
  src/qemu/qemu_migration.h |  10 +-
  src/qemu/qemu_process.c   |  77 ++---
  7 files changed, 371 insertions(+), 636 deletions(-)




diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f652237..30ea9df 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c



@@ -4409,8 +4333,7 @@ static void qemuProcessEventHandler(void *data, void 
*opaque)
  break;
  }

-if (virObjectUnref(vm))
-virObjectUnlock(vm);
+qemuDomObjEndAPI(vm);


Interesting, why the heck have we Unref() here? I mean, in the calltrace 
I don't see where the @vm is Ref()-ed. And this should not use the 
reference obtained via qemuMonitorOpen() as we are connecting only once 
and receiving events multiple times. But this is pre-existing code.



  VIR_FREE(processEvent);
  }




diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index a19e71a..4d9ce09 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c



@@ -5491,6 +5470,7 @@ int qemuProcessAutoDestroyAdd(virQEMUDriverPtr driver,
virConnectPtr conn)
  {
  VIR_DEBUG(vm=%s, conn=%p, vm-def-name, conn);
+virObjectRef(vm);
  return virCloseCallbacksSet(driver-closeCallbacks, vm, conn,
  qemuProcessAutoDestroy);


This function is called conditionally in qemuProcessStart() depending on 
VIR_QEMU_PROCESS_START_AUTODESTROY flag passed...



  }
@@ -5498,9 +5478,12 @@ int qemuProcessAutoDestroyAdd(virQEMUDriverPtr driver,
  int qemuProcessAutoDestroyRemove(virQEMUDriverPtr driver,
   virDomainObjPtr vm)
  {
+int ret;
  VIR_DEBUG(vm=%s, vm-def-name);
-return virCloseCallbacksUnset(driver-closeCallbacks, vm,
-  qemuProcessAutoDestroy);
+ret = virCloseCallbacksUnset(driver-closeCallbacks, vm,
+ qemuProcessAutoDestroy);
+virObjectUnref(vm);
+return ret;


.. on the other hand, this function is called unconditionally in 
qemuProcessStop(). So after 'virsh start' and 'virsh destroy' the 
reference counts don't match. I've been able to debug this and I came up 
with this diff:


diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 4d9ce09..cbd0389 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -5470,7 +5470,6 @@ int qemuProcessAutoDestroyAdd(virQEMUDriverPtr driver,
   virConnectPtr conn)
 {
 VIR_DEBUG(vm=%s, conn=%p, vm-def-name, conn);
-virObjectRef(vm);
 return virCloseCallbacksSet(driver-closeCallbacks, vm, conn,
 qemuProcessAutoDestroy);
 }
@@ -5482,7 +5481,6 @@ int qemuProcessAutoDestroyRemove(virQEMUDriverPtr 
driver,

 VIR_DEBUG(vm=%s, vm-def-name);
 ret = virCloseCallbacksUnset(driver-closeCallbacks, vm,
  qemuProcessAutoDestroy);
-virObjectUnref(vm);
 return ret;
 }

diff --git a/src/util/virclosecallbacks.c b/src/util/virclosecallbacks.c
index 4f26172..4128057 100644
--- a/src/util/virclosecallbacks.c
+++ b/src/util/virclosecallbacks.c
@@ 

Re: [libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-16 Thread Martin Kletzander

On Tue, Dec 16, 2014 at 02:00:09PM +0100, Michal Privoznik wrote:

On 16.12.2014 11:15, Martin Kletzander wrote:

There is one problem that causes various errors in the daemon.  When
domain is waiting for a job, it is unlocked while waiting on the
condition.  However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it get's
unref'd.  If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object.  However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).

The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list.  That reference can be removed when the API is waiting for
a job.  And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.

This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked.  This makes all reference counting
deterministic and makes the code a bit clearer.

Signed-off-by: Martin Kletzander mklet...@redhat.com
---
v2:
  - Comments from Peter and Daniel on v1 implemented, rather not
listing them here as the list was pretty comprehensive and it would
make the reviewer focus on that.

  src/qemu/THREADS.txt  |  40 ++-
  src/qemu/qemu_domain.c|  49 ++--
  src/qemu/qemu_domain.h|  12 +-
  src/qemu/qemu_driver.c| 708 --
  src/qemu/qemu_migration.c | 111 +++-
  src/qemu/qemu_migration.h |  10 +-
  src/qemu/qemu_process.c   |  77 ++---
  7 files changed, 371 insertions(+), 636 deletions(-)




diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f652237..30ea9df 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c



@@ -4409,8 +4333,7 @@ static void qemuProcessEventHandler(void *data, void 
*opaque)
  break;
  }

-if (virObjectUnref(vm))
-virObjectUnlock(vm);
+qemuDomObjEndAPI(vm);


Interesting, why the heck have we Unref() here? I mean, in the calltrace
I don't see where the @vm is Ref()-ed. And this should not use the
reference obtained via qemuMonitorOpen() as we are connecting only once
and receiving events multiple times. But this is pre-existing code.



You won't see that as that is done in another thread.  Each one of
processing methods (e.g. qemuProcessHandleDeviceDeleted,
qemuProcessHandleGuestPanic) reference the VM themselves before
unlocking it to make sure it is not getting disposed of before the
event is processed.  And that processing is left to the worker pool
that does qemuProcessEventHandler().


  VIR_FREE(processEvent);
  }




diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index a19e71a..4d9ce09 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c



@@ -5491,6 +5470,7 @@ int qemuProcessAutoDestroyAdd(virQEMUDriverPtr driver,
virConnectPtr conn)
  {
  VIR_DEBUG(vm=%s, conn=%p, vm-def-name, conn);
+virObjectRef(vm);
  return virCloseCallbacksSet(driver-closeCallbacks, vm, conn,
  qemuProcessAutoDestroy);


This function is called conditionally in qemuProcessStart() depending on
VIR_QEMU_PROCESS_START_AUTODESTROY flag passed...


  }
@@ -5498,9 +5478,12 @@ int qemuProcessAutoDestroyAdd(virQEMUDriverPtr driver,
  int qemuProcessAutoDestroyRemove(virQEMUDriverPtr driver,
   virDomainObjPtr vm)
  {
+int ret;
  VIR_DEBUG(vm=%s, vm-def-name);
-return virCloseCallbacksUnset(driver-closeCallbacks, vm,
-  qemuProcessAutoDestroy);
+ret = virCloseCallbacksUnset(driver-closeCallbacks, vm,
+ qemuProcessAutoDestroy);
+virObjectUnref(vm);
+return ret;


.. on the other hand, this function is called unconditionally in
qemuProcessStop(). So after 'virsh start' and 'virsh destroy' the
reference counts don't match. I've been able to debug this and I came up
with this diff:



This diff seems accurate even for use outside of qemu.  I'm squashing
it in, but I'll wait with the pushing so others have a chance to
react.

Thanks for the review,
Martin


diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 4d9ce09..cbd0389 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -5470,7 +5470,6 @@ int qemuProcessAutoDestroyAdd(virQEMUDriverPtr driver,
   virConnectPtr conn)
 {
 VIR_DEBUG(vm=%s, conn=%p, vm-def-name, conn);
-

Re: [libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-09 Thread Martin Kletzander

On Mon, Dec 08, 2014 at 07:57:50PM +0100, Peter Krempa wrote:

On 12/05/14 15:03, Martin Kletzander wrote:

There is one problem that causes various errors in the daemon.  When
domain is waiting for a job, it is unlocked while waiting on the
condition.  However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it get's
unref'd.  If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object.  However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).

The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list.  That reference can be removed when the API is waiting for
a job.  And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.

This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked.  This makes all reference counting
deterministic and makes the code a bit clearer.

Signed-off-by: Martin Kletzander mklet...@redhat.com
---
 src/qemu/THREADS.txt  |  40 ++-
 src/qemu/qemu_domain.c|  29 +-
 src/qemu/qemu_domain.h|  12 +-
 src/qemu/qemu_driver.c| 708 --
 src/qemu/qemu_migration.c | 108 +++
 src/qemu/qemu_migration.h |  10 +-
 src/qemu/qemu_process.c   |  87 +++---
 7 files changed, 359 insertions(+), 635 deletions(-)

diff --git a/src/qemu/THREADS.txt b/src/qemu/THREADS.txt
index 50a0cf9..53a3415 100644
--- a/src/qemu/THREADS.txt
+++ b/src/qemu/THREADS.txt


...


@@ -109,7 +117,6 @@ To lock the virDomainObjPtr
 To acquire the normal job condition

   qemuDomainObjBeginJob()


*


-- Increments ref count on virDomainObjPtr
 - Waits until the job is compatible with current async job or no
   async job is running
 - Waits for job.cond condition 'job.active != 0' using virDomainObjPtr
@@ -122,14 +129,12 @@ To acquire the normal job condition
   qemuDomainObjEndJob()
 - Sets job.active to 0
 - Signals on job.cond condition
-- Decrements ref count on virDomainObjPtr



 To acquire the asynchronous job condition

   qemuDomainObjBeginAsyncJob()
-- Increments ref count on virDomainObjPtr


Should we mention that having a ref already is pre-requisite for calling
Begin*Job?



I take it as from this patch onwards it is pre-requisite to have a ref
for doing anything.


 - Waits until no async job is running
 - Waits for job.cond condition 'job.active != 0' using virDomainObjPtr
   mutex


...


@@ -179,12 +183,10 @@ To acquire the QEMU monitor lock as part of an 
asynchronous job
 To keep a domain alive while waiting on a remote command

   qemuDomainObjEnterRemote()


Same here ?



Same here.

Maybe it would be nice to document that the only way how to obtain
virDomainPtr is with its own reference.  Would that help?


-- Increments ref count on virDomainObjPtr
 - Releases the virDomainObjPtr lock

   qemuDomainObjExitRemote()
 - Acquires the virDomainObjPtr lock
-- Decrements ref count on virDomainObjPtr


 Design patterns



@@ -195,18 +197,18 @@ Design patterns

  virDomainObjPtr obj;

- obj = virDomainFindByUUID(driver-domains, dom-uuid);
+ obj = qemuDomObjFromDomain(dom);

  ...do work...

- virDomainObjUnlock(obj);
+ qemuDomObjEndAPI(obj);


obj to match the code.



I guess I missed some 'git add' after one of the 'sed -s' :)


diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 220304f..d4a6021 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -1318,8 +1318,6 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
 priv-jobs_queued++;
 then = now + QEMU_JOB_WAIT_TIME;

-virObjectRef(obj);
-
  retry:
 if (cfg-maxQueuedJobs 
 priv-jobs_queued  cfg-maxQueuedJobs) {
@@ -1398,7 +1396,6 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,

  cleanup:
 priv-jobs_queued--;
-virObjectUnref(obj);
 virObjectUnref(cfg);
 return ret;
 }
@@ -1467,7 +1464,8 @@ qemuDomainObjBeginNestedJob(virQEMUDriverPtr driver,
  * Returns true if @obj was still referenced, false if it was
  * disposed of.


Comment is no longer valid, also mentioning that caller should hold a
reference would be helpful.



Yes, this should be removed.  About the reference, see above.


  */
-bool qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
+void
+qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
 {
 qemuDomainObjPrivatePtr priv = 

Re: [libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-09 Thread Peter Krempa
On 12/05/14 15:03, Martin Kletzander wrote:
 There is one problem that causes various errors in the daemon.  When
 domain is waiting for a job, it is unlocked while waiting on the
 condition.  However, if that domain is for example transient and being
 removed in another API (e.g. cancelling incoming migration), it get's
 unref'd.  If the first call, that was waiting, fails to get the job, it
 unref's the domain object, and because it was the last reference, it
 causes clearing of the whole domain object.  However, when finishing the
 call, the domain must be unlocked, but there is no way for the API to
 know whether it was cleaned or not (unless there is some ugly temporary
 variable, but let's scratch that).
 
 The root cause is that our APIs don't ref the objects they are using and
 all use the implicit reference that the object has when it is in the
 domain list.  That reference can be removed when the API is waiting for
 a job.  And because each domain doesn't do its ref'ing, it results in
 the ugly checking of the return value of virObjectUnref() that we have
 everywhere.
 
 This patch changes qemuDomObjFromDomain() to ref the domain (using
 virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
 should be the only function in which the return value of
 virObjectUnref() is checked.  This makes all reference counting
 deterministic and makes the code a bit clearer.
 
 Signed-off-by: Martin Kletzander mklet...@redhat.com
 ---
  src/qemu/THREADS.txt  |  40 ++-
  src/qemu/qemu_domain.c|  29 +-
  src/qemu/qemu_domain.h|  12 +-
  src/qemu/qemu_driver.c| 708 
 --
  src/qemu/qemu_migration.c | 108 +++
  src/qemu/qemu_migration.h |  10 +-
  src/qemu/qemu_process.c   |  87 +++---
  7 files changed, 359 insertions(+), 635 deletions(-)
 

Continuing where I finished yesterday:

 diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
 index 9152cf5..2e0b7fc 100644
 --- a/src/qemu/qemu_driver.c
 +++ b/src/qemu/qemu_driver.c

[...]

 @@ -15128,21 +14970,18 @@ static virDomainPtr 
 qemuDomainQemuAttach(virConnectPtr conn,
 VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE,
 NULL)))
  goto cleanup;
 -

Keep the empty line for visual separation.

 +virObjectRef(vm);
  def = NULL;
 
  if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY)  0) {
  qemuDomainRemoveInactive(driver, vm);
 -vm = NULL;
  goto cleanup;
  }
 
  if (qemuProcessAttach(conn, driver, vm, pid,
pidfile, monConfig, monJSON)  0) {
 -if (qemuDomainObjEndJob(driver, vm))
 -qemuDomainRemoveInactive(driver, vm);
 -vm = NULL;
 -monConfig = NULL;

Ownership of monConfig is transferred to qemuProcessAttach(), thus you
need to clean the pointer to avoid double free.

 +qemuDomainObjEndJob(driver, vm);
 +qemuDomainRemoveInactive(driver, vm);
  goto cleanup;
  }
 

[...]

 @@ -15513,8 +15348,7 @@ qemuDomainBlockPivot(virConnectPtr conn,
  }
 
 
 -/* bandwidth in MiB/s per public API. Caller must lock vm beforehand,
 - * and not access it afterwards.  */
 +/* bandwidth in MiB/s per public API. Caller must lock vm beforehand. */

The semantics didn't change. qemuDomainBlockJobImpl() consumes vm and
calls qemuDomObjEndAPI() on it without adding any reference thus callers
still shouldn't touch VM after passing it to this function.

  static int
  qemuDomainBlockJobImpl(virDomainObjPtr vm,
 virConnectPtr conn,

[...]


 @@ -15776,7 +15606,7 @@ qemuDomainBlockJobAbort(virDomainPtr dom, const char 
 *path, unsigned int flags)
 
  static int
  qemuDomainGetBlockJobInfo(virDomainPtr dom, const char *path,
 -   virDomainBlockJobInfoPtr info, unsigned int flags)
 +  virDomainBlockJobInfoPtr info, unsigned int flags)

Unrelated whitespace fix?

  {
  virQEMUDriverPtr driver = dom-conn-privateData;
  qemuDomainObjPrivatePtr priv;

[...]

 @@ -18733,13 +18523,15 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
  continue;
 
  if (doms != domlist 
 -!virConnectGetAllDomainStatsCheckACL(conn, dom-def))
 +!virConnectGetAllDomainStatsCheckACL(conn, dom-def)) {
 +virObjectUnlock(dom);

Hmm, this should be a separate fix as I've done with the migration APIs.
Or perhaps is this a rebase artifact from a un-published patch?

  continue;
 +}
 
 -if (HAVE_JOB(domflags) 
 -qemuDomainObjBeginJob(driver, dom, QEMU_JOB_QUERY)  0)
 +if (HAVE_JOB(privflags) 
 +qemuDomainObjBeginJob(driver, dom, QEMU_JOB_QUERY) = 0)
  /* As it was never requested. Gather as much as possible anyway. 
 */
 -domflags = ~QEMU_DOMAIN_STATS_HAVE_JOB;
 +domflags |= QEMU_DOMAIN_STATS_HAVE_JOB;
 
  if 

Re: [libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-08 Thread Peter Krempa
On 12/05/14 15:03, Martin Kletzander wrote:
 There is one problem that causes various errors in the daemon.  When
 domain is waiting for a job, it is unlocked while waiting on the
 condition.  However, if that domain is for example transient and being
 removed in another API (e.g. cancelling incoming migration), it get's
 unref'd.  If the first call, that was waiting, fails to get the job, it
 unref's the domain object, and because it was the last reference, it
 causes clearing of the whole domain object.  However, when finishing the
 call, the domain must be unlocked, but there is no way for the API to
 know whether it was cleaned or not (unless there is some ugly temporary
 variable, but let's scratch that).
 
 The root cause is that our APIs don't ref the objects they are using and
 all use the implicit reference that the object has when it is in the
 domain list.  That reference can be removed when the API is waiting for
 a job.  And because each domain doesn't do its ref'ing, it results in
 the ugly checking of the return value of virObjectUnref() that we have
 everywhere.
 
 This patch changes qemuDomObjFromDomain() to ref the domain (using
 virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
 should be the only function in which the return value of
 virObjectUnref() is checked.  This makes all reference counting
 deterministic and makes the code a bit clearer.
 
 Signed-off-by: Martin Kletzander mklet...@redhat.com
 ---
  src/qemu/THREADS.txt  |  40 ++-
  src/qemu/qemu_domain.c|  29 +-
  src/qemu/qemu_domain.h|  12 +-
  src/qemu/qemu_driver.c| 708 
 --
  src/qemu/qemu_migration.c | 108 +++
  src/qemu/qemu_migration.h |  10 +-
  src/qemu/qemu_process.c   |  87 +++---
  7 files changed, 359 insertions(+), 635 deletions(-)
 
 diff --git a/src/qemu/THREADS.txt b/src/qemu/THREADS.txt
 index 50a0cf9..53a3415 100644
 --- a/src/qemu/THREADS.txt
 +++ b/src/qemu/THREADS.txt

...

 @@ -109,7 +117,6 @@ To lock the virDomainObjPtr
  To acquire the normal job condition
 
qemuDomainObjBeginJob()

*

 -- Increments ref count on virDomainObjPtr
  - Waits until the job is compatible with current async job or no
async job is running
  - Waits for job.cond condition 'job.active != 0' using virDomainObjPtr
 @@ -122,14 +129,12 @@ To acquire the normal job condition
qemuDomainObjEndJob()
  - Sets job.active to 0
  - Signals on job.cond condition
 -- Decrements ref count on virDomainObjPtr
 
 
 
  To acquire the asynchronous job condition
 
qemuDomainObjBeginAsyncJob()
 -- Increments ref count on virDomainObjPtr

Should we mention that having a ref already is pre-requisite for calling
Begin*Job?

  - Waits until no async job is running
  - Waits for job.cond condition 'job.active != 0' using virDomainObjPtr
mutex

...

 @@ -179,12 +183,10 @@ To acquire the QEMU monitor lock as part of an 
 asynchronous job
  To keep a domain alive while waiting on a remote command
 
qemuDomainObjEnterRemote()

Same here ?

 -- Increments ref count on virDomainObjPtr
  - Releases the virDomainObjPtr lock
 
qemuDomainObjExitRemote()
  - Acquires the virDomainObjPtr lock
 -- Decrements ref count on virDomainObjPtr
 
 
  Design patterns

 @@ -195,18 +197,18 @@ Design patterns
 
   virDomainObjPtr obj;
 
 - obj = virDomainFindByUUID(driver-domains, dom-uuid);
 + obj = qemuDomObjFromDomain(dom);
 
   ...do work...
 
 - virDomainObjUnlock(obj);
 + qemuDomObjEndAPI(obj);

obj to match the code.

 
 
   * Updating something directly to do with a virDomainObjPtr
 
   virDomainObjPtr obj;
 
 - obj = virDomainFindByUUID(driver-domains, dom-uuid);
 + obj = qemuDomObjFromDomain(dom);
 
   qemuDomainObjBeginJob(obj, QEMU_JOB_TYPE);
 
 @@ -214,18 +216,15 @@ Design patterns
 
   qemuDomainObjEndJob(obj);
 
 - virDomainObjUnlock(obj);
 -
 -
 + qemuDomObjEndAPI(obj);

obj ...

 
 
   * Invoking a monitor command on a virDomainObjPtr
 
 -
   virDomainObjPtr obj;
   qemuDomainObjPrivatePtr priv;
 
 - obj = virDomainFindByUUID(driver-domains, dom-uuid);
 + obj = qemuDomObjFromDomain(dom);
 
   qemuDomainObjBeginJob(obj, QEMU_JOB_TYPE);
 
 @@ -240,8 +239,7 @@ Design patterns
   ...do final work...
 
   qemuDomainObjEndJob(obj);
 - virDomainObjUnlock(obj);
 -
 + qemuDomObjEndAPI(obj);

obj ...

 
 
   * Running asynchronous job
 @@ -249,7 +247,7 @@ Design patterns
   virDomainObjPtr obj;
   qemuDomainObjPrivatePtr priv;
 
 - obj = virDomainFindByUUID(driver-domains, dom-uuid);
 + obj = qemuDomObjFromDomain(dom);
 
   qemuDomainObjBeginAsyncJob(obj, QEMU_ASYNC_JOB_TYPE);
   qemuDomainObjSetAsyncJobMask(obj, allowedJobs);
 @@ -281,7 +279,7 @@ Design patterns
   ...do final work...
 
   qemuDomainObjEndAsyncJob(obj);
 - virDomainObjUnlock(obj);
 + qemuDomObjEndAPI(obj);

obj 

[libvirt] [PATCH v2 2/2] qemu: completely rework reference counting

2014-12-05 Thread Martin Kletzander
There is one problem that causes various errors in the daemon.  When
domain is waiting for a job, it is unlocked while waiting on the
condition.  However, if that domain is for example transient and being
removed in another API (e.g. cancelling incoming migration), it get's
unref'd.  If the first call, that was waiting, fails to get the job, it
unref's the domain object, and because it was the last reference, it
causes clearing of the whole domain object.  However, when finishing the
call, the domain must be unlocked, but there is no way for the API to
know whether it was cleaned or not (unless there is some ugly temporary
variable, but let's scratch that).

The root cause is that our APIs don't ref the objects they are using and
all use the implicit reference that the object has when it is in the
domain list.  That reference can be removed when the API is waiting for
a job.  And because each domain doesn't do its ref'ing, it results in
the ugly checking of the return value of virObjectUnref() that we have
everywhere.

This patch changes qemuDomObjFromDomain() to ref the domain (using
virDomainObjListFindByUUIDRef()) and adds qemuDomObjEndAPI() which
should be the only function in which the return value of
virObjectUnref() is checked.  This makes all reference counting
deterministic and makes the code a bit clearer.

Signed-off-by: Martin Kletzander mklet...@redhat.com
---
 src/qemu/THREADS.txt  |  40 ++-
 src/qemu/qemu_domain.c|  29 +-
 src/qemu/qemu_domain.h|  12 +-
 src/qemu/qemu_driver.c| 708 --
 src/qemu/qemu_migration.c | 108 +++
 src/qemu/qemu_migration.h |  10 +-
 src/qemu/qemu_process.c   |  87 +++---
 7 files changed, 359 insertions(+), 635 deletions(-)

diff --git a/src/qemu/THREADS.txt b/src/qemu/THREADS.txt
index 50a0cf9..53a3415 100644
--- a/src/qemu/THREADS.txt
+++ b/src/qemu/THREADS.txt
@@ -26,12 +26,20 @@ There are a number of locks on various objects
   * virDomainObjPtr

 Will be locked after calling any of the virDomainFindBy{ID,Name,UUID}
-methods.
+methods.  However, preferred method is qemuDomObjFromDomain() that uses
+virDomainFindByUUIDRef() which also increases the reference counter and
+finds the domain in the domain list without blocking all other lookups.
+When the domain is locked and the reference increased, the prefered way of
+decrementing the reference counter and unlocking the domain is using the
+qemuDomObjEndAPI() function.

 Lock must be held when changing/reading any variable in the virDomainObjPtr

 If the lock needs to be dropped  then re-acquired for a short period of
 time, the reference count must be incremented first using 
virDomainObjRef().
+There is no need to increase the reference count if qemuDomObjFromDomain()
+was used for looking up the domain.  In this case there is one reference
+already added by that function.

 This lock must not be held for anything which sleeps/waits (i.e. monitor
 commands).
@@ -109,7 +117,6 @@ To lock the virDomainObjPtr
 To acquire the normal job condition

   qemuDomainObjBeginJob()
-- Increments ref count on virDomainObjPtr
 - Waits until the job is compatible with current async job or no
   async job is running
 - Waits for job.cond condition 'job.active != 0' using virDomainObjPtr
@@ -122,14 +129,12 @@ To acquire the normal job condition
   qemuDomainObjEndJob()
 - Sets job.active to 0
 - Signals on job.cond condition
-- Decrements ref count on virDomainObjPtr



 To acquire the asynchronous job condition

   qemuDomainObjBeginAsyncJob()
-- Increments ref count on virDomainObjPtr
 - Waits until no async job is running
 - Waits for job.cond condition 'job.active != 0' using virDomainObjPtr
   mutex
@@ -141,7 +146,6 @@ To acquire the asynchronous job condition
   qemuDomainObjEndAsyncJob()
 - Sets job.asyncJob to 0
 - Broadcasts on job.asyncCond condition
-- Decrements ref count on virDomainObjPtr



@@ -179,12 +183,10 @@ To acquire the QEMU monitor lock as part of an 
asynchronous job
 To keep a domain alive while waiting on a remote command

   qemuDomainObjEnterRemote()
-- Increments ref count on virDomainObjPtr
 - Releases the virDomainObjPtr lock

   qemuDomainObjExitRemote()
 - Acquires the virDomainObjPtr lock
-- Decrements ref count on virDomainObjPtr


 Design patterns
@@ -195,18 +197,18 @@ Design patterns

  virDomainObjPtr obj;

- obj = virDomainFindByUUID(driver-domains, dom-uuid);
+ obj = qemuDomObjFromDomain(dom);

  ...do work...

- virDomainObjUnlock(obj);
+ qemuDomObjEndAPI(obj);


  * Updating something directly to do with a virDomainObjPtr

  virDomainObjPtr obj;

- obj = virDomainFindByUUID(driver-domains, dom-uuid);
+ obj = qemuDomObjFromDomain(dom);

  qemuDomainObjBeginJob(obj, QEMU_JOB_TYPE);

@@ -214,18 +216,15 @@ Design patterns

  qemuDomainObjEndJob(obj);

-