On 20.09.2019 at 20:55, Tony Asleson wrote:
> On 9/20/19 1:11 PM, Kevin Wolf wrote:
> > Emitting a QMP event when blkdebug injects an error makes sense to me.
> >
> > I wouldn't use it for this case, though, because this would become racy.
> > It could happen that the guest writes to the image, which sends a QMP
> > event, and then reads before the external program has removed the error.
>
> My POC had a single lock protecting its shared state. I'm kind of
> surprised no one jumped on that because it's a big point of lock
> contention.

I think people didn't review the code in detail because we're still
discussing very high-level design questions.

Anyway, I did mention that I'd like to get your code out of the way for
the fast path when the feature isn't used. If the user explicitly
enabled the feature and we're basically in a debugging setup, the lock
contention should be acceptable. In fact, the mutex might not even be
necessary because the code should be covered by the AioContext lock.

However, I don't see how this locking could fix the race I mention. It's
not a race between two QEMU components, but between the guest and a QMP
client. A mutex in QEMU certainly feels like the wrong way to address
it.

If you really wanted an external process to control this, you would have
to fully stop the VM whenever an error is injected and only continue it
via QMP after the QMP client has decided whether or not to disable the
error.

Because you'd need a custom QMP client then, you wouldn't be able to use
things like libvirt for such QEMU instances.

Kevin
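
P.S. To make the stop/continue flow concrete, here is a rough sketch of
the decision logic such a custom QMP client would need. It is only an
illustration under assumptions: the event name BLKDEBUG_ERROR_INJECTED
and the command x-blkdebug-remove-error are hypothetical and do not
exist in QEMU, and the sketch assumes QEMU has already stopped the VM
before emitting the event. Only "cont" is an existing QMP command.

```python
# Hypothetical event name: stands in for the "blkdebug injected an
# error" event discussed in this thread. QEMU does not emit it today.
INJECT_EVENT = "BLKDEBUG_ERROR_INJECTED"

# Hypothetical command for removing the injected error again.
REMOVE_COMMAND = "x-blkdebug-remove-error"


def commands_for(message, keep_error):
    """Return the QMP commands to send in reaction to one QMP message.

    Assumes the VM is already stopped when the event arrives, so the
    guest cannot issue further I/O until the client sends "cont".
    `keep_error` is a callback that receives the event data and returns
    True if the injected error should stay active.
    """
    if message.get("event") != INJECT_EVENT:
        # Unrelated events and command replies need no reaction here.
        return []

    data = message.get("data", {})
    commands = []
    if not keep_error(data):
        commands.append({"execute": REMOVE_COMMAND, "arguments": data})

    # "cont" is the existing QMP command that resumes a stopped VM. It
    # always goes last, so the decision is in place before the guest
    # runs again.
    commands.append({"execute": "cont"})
    return commands
```

The ordering is the whole point: because the guest only resumes after
the client's decision has been sent, the guest can no longer read
before the external program has removed the error.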