cancellation

Paolo Bonzini Fri, 09 Sep 2011 06:12:36 -0700

On 09/09/2011 02:59 PM, Kevin Wolf wrote:

Also, I think it should be -EIO instead of -ENOMEM (even though it
doesn't make any difference if we don't call the callback)


If I understood the code correctly, dbs->io_func can only fail if it
fails to get an AIOCB, which is basically out-of-memory.


Yeah, maybe you're right with the error code. Anyway, should we call the
callback?

Considering that out-of-memory cannot happen and a couple of drivers doreturn NULL, you're right about going for EIO and calling the callback.

I think it would make sense to require block drivers to return a valid
ACB (qemu_aio_get never returns NULL). If they have an error to report
they should schedule a BH that calls the callback.

Perhaps you can write it down on the Wiki? There is already a blockdriver braindump page, right?

Did you consider that there are block drivers that implement
bdrv_aio_cancel() as waiting for completion of outstanding requests? I
think in that case dma_complete() may be called twice. For most of it,
this shouldn't be a problem, but I think it doesn't work with the
qemu_aio_release(dbs).


Right.  But then what to do (short of inventing reference counting
of some sort for AIOCBs) with those that don't?  Leaking should not
be acceptable, should it?


Hm, not sure. This whole cancellation stuff is so broken...

Maybe we should really refcount dbs (actually it would be more like a
bool in_cancel that means that dma_complete doesn't release the AIOCB)

But then it would leak for the drivers that do not wait for completion?The problem is that the caller specifies what you should do but you donot know it.

In fact it may be worse than just the qemu_aio_release: if the driver iswaiting for the request to complete, it will write over the bouncebuffer after dma_bdrv_unmap has been called.

In other words, I don't think this is fixable at all without referencecounting _all_ AIOCBs.

In short, this patch can be dropped, but it shows more problems. :)


I'd rather have it fixed than dropped :-)

Me too. However, I'm looking at this because SCSI reset cancels allpending I/O, and I would like to do a SCSI reset (and report unitattention) just before migration, as an easy way to avoid saving thepending requests in the migration stream. But if it's just leaking theiovec on the migration source, I really do not care much, not yet atleast, because it's going to exit anyway. The other problems onlyhappen with DMA over I/O areas, which virtio-scsi won't do. I don'tthink it's really fixable, so I quit. :)


Paolo

Re: [Qemu-devel] [PATCH 3/5] dma-helpers: rewrite completion/cancellation

Reply via email to