On 10.08.2020 20:40, Daniel Henrique Barboza wrote:
>
>
> On 7/23/20 7:14 AM, Nikolay Shirokovskiy wrote:
>> We are dropping the only reference here so that the event loop thread
>> will exit synchronously. In order to avoid deadlocks we need to unlock
>> the VM so that any handler being called can finish execution and thus
>> the event loop thread can finish too.
>>
>> Signed-off-by: Nikolay Shirokovskiy <nshirokovs...@virtuozzo.com>
>> ---
>> src/qemu/qemu_domain.c | 18 ++++++++++++++----
>> 1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index 5b22eb2..82b3d11 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -1637,11 +1637,21 @@ void
>>  qemuDomainObjStopWorker(virDomainObjPtr dom)
>>  {
>>      qemuDomainObjPrivatePtr priv = dom->privateData;
>> +    virEventThread *eventThread;
>>  
>> -    if (priv->eventThread) {
>> -        g_object_unref(priv->eventThread);
>> -        priv->eventThread = NULL;
>> -    }
>> +    if (!priv->eventThread)
>> +        return;
>> +
>> +    /*
>> +     * We are dropping the only reference here so that the event loop thread
>> +     * will exit synchronously. In order to avoid deadlocks we need to unlock
>> +     * the VM so that any handler being called can finish execution and thus
>> +     * the event loop thread can finish too.
>> +     */
>> +    eventThread = g_steal_pointer(&priv->eventThread);
>> +    virObjectUnlock(dom);
>> +    g_object_unref(eventThread);
>> +    virObjectLock(dom);
>
> I understand the intention of the code thanks to the comment just before
> it, but this is not robust. This unlock -> do stuff -> lock sequence only
> works if the caller of qemuDomainObjStopWorker() is holding the mutex.
>
> I see that this is the case for qemuDomainObjStopWorkerIter(), but
> qemuDomainObjStopWorker() is also called by qemuProcessStop().
> qemuProcessStop() does not lock or unlock any mutex itself, so you are
> assuming that all callers of qemuProcessStop() hold the mutex before
> calling it. One of them is qemuProcessInit(), which calls
> qemuProcessStop() at the 'stop' jump label, and there is no mutex lock
> beforehand there either.
>
> Now we're assuming that all callers of qemuProcessInit() hold the mutex
> beforehand. In a quick glance at the code I saw at least two places where
> this isn't the case, so then we would need to assume that the callers of
> those functions hold the mutex. So on and so forth.
>
>
> Given that this code only makes sense when called from
> qemuDomainObjStopWorkerIter(), I'd suggest removing the lock/unlock from
> that function (turning it into just a call to qemuDomainObjStopWorker())
> and moving them inside the body of qemuDomainObjStopWorker(), so that the
> mutex is locked and unlocked within the scope of the same function.
>
Hi, Daniel.
Actually all callers of qemuProcessStop hold the lock. Moreover, they even hold
the job condition, and calling qemuProcessStop without the lock/job condition
would be a mistake AFAIU. (qemuConnectDomainXMLToNative is the only exception I
see that holds the lock but not the job condition, but that function creates a
new vm object that is not shared with other threads.)

I understand your concern, but there are precedents. Take a look, for example,
at virDomainObjListRemove. The lock requirements are documented for
virDomainObjListRemove, and I can do the same for qemuDomainObjStopWorker too,
but that looks like over-documenting to me.
Nikolay