On 05/16/2011 07:30 PM, Eric Blake wrote: > + while (priv->jobSignals& QEMU_JOB_SIGNAL_BLKSTAT) > + ignore_value(virCondWait(&priv->signalCond,&vm->lock)); > + > + priv->jobSignalsData.statDevName = disk->info.alias; > + priv->jobSignalsData.blockStat = stats; > + priv->jobSignalsData.statRetCode = -1; > + priv->jobSignals |= QEMU_JOB_SIGNAL_BLKSTAT; > + > + while (priv->jobSignals& QEMU_JOB_SIGNAL_BLKSTAT) > + ignore_value(virCondWait(&priv->signalCond,&vm->lock));
I'm not sure that the original problem with jobSignals is fixed (the one that you solved by splitting the retcode field). A similar race can happen with two threads, both requesting blkstat. Thread 1 starts a migration. Thread 2 queries BlockStats, waits for the condition variable and priv->jobSignals to be 0. Thread 2 sets priv->jobSignals to non-zero, then waits for condition variable and priv->jobSignals to clear BLKSTATS. Thread 1 processes thread 2's query, and clears priv->jobSignals Thread 3 sees priv->jobSignals == 0 and queries BlockStats. Thread 1 processes thread 3's query, and clears priv->jobSignals Thread 2 gets condition lock, but sees the answer intended for thread 3. At this point I'm not even sure if there is a deadlock or what. I think the code should use two condition variables. signalCond is broadcast when you _can start_ a job, resultCond is broadcast when a job has finished. while ((priv->jobSignals | priv->jobResults) & QEMU_JOB_SIGNAL_BLKSTAT) ignore_value(virCondWait(&priv->signalCond, &vm->lock)); priv->jobSignalsData.statDevName = disk->info.alias; priv->jobSignalsData.blockStat = stats; priv->jobSignalsData.statRetCode = -1; priv->jobSignals |= QEMU_JOB_SIGNAL_BLKSTAT; while ((priv->jobResults & QEMU_JOB_SIGNAL_BLKSTAT) == 0) ignore_value(virCondWait(&priv->resultCond, &vm->lock)); ret = priv->jobSignalsData.statRetCode; priv->jobResults ^= QEMU_JOB_SIGNAL_BLKSTAT; virCondBroadcast(&priv->signalCond); ... priv->jobSignalsData.statRetCode = ret; priv->jobResults |= QEMU_JOB_SIGNAL_BLKSTAT; priv->jobSignals ^= QEMU_JOB_SIGNAL_BLKSTAT; virCondBroadcast(&priv->resultCond); ... while (priv->jobSignals & ~priv->jobResults) { if (qemuMigrationProcessJobSignals(driver, vm, job, false) < 0) goto cleanup; } ... cleanup: while (priv->jobSignals & ~priv->jobResults) { qemuMigrationProcessJobSignals(driver, vm, job, false); } -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list