On Tue, Jan 25, 2011 at 02:43:43PM +0800, Wen Congyang wrote:
> The reason of libvirtd cores dump is that:
> We add vm->refs when we alloc the memory, and decrease it 
> in the function qemuHandleMonitorEOF() in other thread.
> 
> We add vm->refs in the function qemuConnectMonitor() and
> decrease it when the vm is inactive.
> 
> The libvirtd will block in the function qemuMonitorSetCapabilities()
> because the vm is stopped by signal SIGSTOP. Now the vm->refs is 2.
> 
> Then we kill the vm by signal SIGKILL. The function
> qemuMonitorSetCapabilities() failed, and then we will decrease vm->refs
> in the function qemuMonitorClose().
> In another thread, mon->fd is broken and the function
> qemuHandleMonitorEOF() is called. 
> 
> If qemuHandleMonitorEOF() decreases vm->refs before qemuConnectMonitor()
> returns, vm->refs will be decrease to 0 and the memory is freed.
> 
> We will call qemudShutdownVMDaemon() as qemuConnectMonitor() failed.
> The memory has been freed, so qemudShutdownVMDaemon() is too dangerous.
> 
> We will reference NULL pointer in the function 
> virDomainConfVMNWFilterTeardown():
> =============
> void
> virDomainConfVMNWFilterTeardown(virDomainObjPtr vm) {
>     int i;
> 
>     if (nwfilterDriver != NULL) {
>         for (i = 0; i < vm->def->nnets; i++)
>             virDomainConfNWFilterTeardown(vm->def->nets[i]);
>     }
> }
> ============
> vm->def->nnets is not 0 but vm->def->nets is NULL(We don't set vm->def->nnets
> to 0 when we free vm).
> 
> We should add an extra reference of vm to avoid vm to be deleted if
> qemuConnectMonitor() failed.
> 
> Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
> 
> ---
>  src/qemu/qemu_driver.c |   31 ++++++++++++++++++++++---------
>  1 files changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 34cc29f..613c2d4 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -930,6 +930,10 @@ qemuReconnectDomain(void *payload, const char *name 
> ATTRIBUTE_UNUSED, void *opaq
>  
>      priv = obj->privateData;
>  
> +    /* Hold an extra reference because we can't allow 'vm' to be
> +     * deleted if qemuConnectMonitor() failed */
> +    virDomainObjRef(obj);
> +
>      /* XXX check PID liveliness & EXE path */
>      if (qemuConnectMonitor(driver, obj) < 0)
>          goto error;
> @@ -961,18 +965,27 @@ qemuReconnectDomain(void *payload, const char *name 
> ATTRIBUTE_UNUSED, void *opaq
>      if (obj->def->id >= driver->nextvmid)
>          driver->nextvmid = obj->def->id + 1;
>  
> -    virDomainObjUnlock(obj);
> +    if (virDomainObjUnref(obj) > 0)
> +        virDomainObjUnlock(obj);
>      return;
>  
>  error:
> -    /* We can't get the monitor back, so must kill the VM
> -     * to remove danger of it ending up running twice if
> -     * user tries to start it again later */
> -    qemudShutdownVMDaemon(driver, obj, 0);
> -    if (!obj->persistent)
> -        virDomainRemoveInactive(&driver->domains, obj);
> -    else
> -        virDomainObjUnlock(obj);
> +    if (!virDomainObjIsActive(obj)) {
> +        if (virDomainObjUnref(obj) > 0)
> +            virDomainObjUnlock(obj);
> +        return;
> +    }
> +
> +    if (virDomainObjUnref(obj) > 0) {
> +        /* We can't get the monitor back, so must kill the VM
> +         * to remove danger of it ending up running twice if
> +         * user tries to start it again later */
> +        qemudShutdownVMDaemon(driver, obj, 0);
> +        if (!obj->persistent)
> +            virDomainRemoveInactive(&driver->domains, obj);
> +        else
> +            virDomainObjUnlock(obj);
> +    }
>  }

On closer examination I see why this change is required.
Normally we would be doing qemuDomainObjBeginJob before
doing anything with the monitor and that grabs an extra
reference.

ACK

Daniel

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Reply via email to