Re: [RFC 0/4] POC: Generating realistic block errors

Kevin Wolf Mon, 30 Sep 2019 07:55:41 -0700

Am 20.09.2019 um 20:55 hat Tony Asleson geschrieben:
> On 9/20/19 1:11 PM, Kevin Wolf wrote:
> > Emitting a QMP event when blkdebug injects an error makes sense to me.
> > 
> > I wouldn't use it for this case, though, because this would become racy.
> > It could happen that the guest writes to the image, which sends a QMP
> > event, and then reads before the external program has removed the error.
> 
> My POC had a single lock protecting it's shared state.  I'm kind of
> surprised no one jumped on that because it's a big point of lock
> contention.


I think people didn't review the code in detail because we're still
discussing very high-level design questions.

Anyway, I did mention that I'd like to get your code out of the way for
the fast path when the feature isn't used. If the user explicitly
enabled the feature and we're basically in a debugging setup, the lock
contention should be acceptable. In fact, the mutex might not even be
necessary because the code should be covered by the AioContext lock.

However, I don't see how this locking could fix the race I mention. It's
not a race between two QEMU components, but between the guest and a QMP
client. A mutex in QEMU certainly feels like the wrong way to address
it.

If you really wanted an external process to control this, you would have
to fully stop the VM whenever an error is injected and only continue it
via QMP after the QMP client has decided whether or not to disable the
error. Because you'd need a custom QMP client then, you wouldn't be able
to use things like libvirt for such QEMU instances.

Kevin

Re: [RFC 0/4] POC: Generating realistic block errors

Reply via email to