cancellation

Kevin Wolf Fri, 09 Sep 2011 06:31:46 -0700

On 09.09.2011 15:12, Paolo Bonzini wrote:
> On 09/09/2011 02:59 PM, Kevin Wolf wrote:
>>>> Also, I think it should be -EIO instead of -ENOMEM (even though it
>>>> doesn't make any difference if we don't call the callback)
>>>
>>> If I understood the code correctly, dbs->io_func can only fail if it
>>> fails to get an AIOCB, which is basically out-of-memory.
>>
>> Yeah, maybe you're right with the error code. Anyway, should we call the
>> callback?
> 
> Considering that out-of-memory cannot happen and a couple of drivers do 
> return NULL, you're right about going for EIO and calling the callback.
> 
>> I think it would make sense to require block drivers to return a valid
>> ACB (qemu_aio_get never returns NULL). If they have an error to report
>> they should schedule a BH that calls the callback.
> 
> Perhaps you can write it down on the Wiki?  There is already a block 
> driver braindump page, right?


http://wiki.qemu.org/BlockRoadmap

This one? Adding it there now.
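
As a rough illustration of the rule quoted above (always return a valid ACB and report errors from a BH instead of returning NULL), here is a toy sketch. Everything named Toy*/toy_* is hypothetical and only stands in for the real QEMU structures and BH machinery; the one-slot queue in particular is an assumption made to keep the example small.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical completion callback type, modelled loosely on the block layer */
typedef void ToyCompletionFunc(void *opaque, int ret);

/* Hypothetical request handle standing in for an AIOCB */
typedef struct ToyACB {
    ToyCompletionFunc *cb;
    void *opaque;
    int ret;
} ToyACB;

/* One-slot stand-in for a scheduled bottom half */
static ToyACB *toy_bh_pending;
static ToyACB toy_acb_storage;

/* Driver entry point that hits an error at submit time: it still
 * returns a valid handle and defers the error to the "BH". */
static ToyACB *toy_aio_submit_failing(ToyCompletionFunc *cb, void *opaque)
{
    ToyACB *acb = &toy_acb_storage;

    assert(cb != NULL);
    acb->cb = cb;
    acb->opaque = opaque;
    acb->ret = -EIO;
    toy_bh_pending = acb;   /* schedule; do not call cb from here */
    return acb;             /* never NULL */
}

/* Main-loop step: run the deferred completion, if one is queued */
static void toy_run_bh(void)
{
    ToyACB *acb = toy_bh_pending;

    if (acb) {
        toy_bh_pending = NULL;
        acb->cb(acb->opaque, acb->ret);
    }
}

/* Example callback that just records the result */
static void toy_record_result(void *opaque, int ret)
{
    *(int *)opaque = ret;
}
```

The caller always gets a handle it can cancel, and the error only reaches the callback once the deferred step runs, never from inside the submit call.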

>>>> Did you consider that there are block drivers that implement
>>>> bdrv_aio_cancel() as waiting for completion of outstanding requests? I
>>>> think in that case dma_complete() may be called twice. For most of it,
>>>> this shouldn't be a problem, but I think it doesn't work with the
>>>> qemu_aio_release(dbs).
>>>
>>> Right.  But then what to do (short of inventing reference counting
>>> of some sort for AIOCBs) with those that don't?  Leaking should not
>>> be acceptable, should it?
>>
>> Hm, not sure. This whole cancellation stuff is so broken...
>>
>> Maybe we should really refcount dbs (actually it would be more like a
>> bool in_cancel that means that dma_complete doesn't release the AIOCB)
> 
> But then it would leak for the drivers that do not wait for completion?
> The problem is that the caller specifies what you should do but you do
> not know it.

Why would it leak? To clarify, what I'm thinking of is:

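static void dma_aio_cancel(BlockDriverAIOCB *acb)
{
    DMAAIOCB *dbs = container_of(acb, DMAAIOCB, common);

    if (dbs->acb) {
        /* Separate name so the parameter acb is not shadowed */
        BlockDriverAIOCB *child = dbs->acb;
        dbs->acb = NULL;
        /* The driver may complete the request from inside
         * bdrv_aio_cancel(); in_cancel keeps dma_complete() from
         * releasing dbs in that case. */
        dbs->in_cancel = true;
        bdrv_aio_cancel(child);
        dbs->in_cancel = false;
    }
    /* Suppress the caller's completion callback */
    dbs->common.cb = NULL;
    /* in_cancel is false again, so this call releases dbs */
    dma_complete(dbs, 0);
}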

And then in dma_complete:

    ...
    if (!dbs->in_cancel) {
        qemu_aio_release(dbs);
    }
}

So the release we skip is the one in the completion callback, which
bdrv_aio_cancel may or may not invoke indirectly. We always call
dma_complete at the end of dma_aio_cancel, with in_cancel already
cleared, so dbs is properly freed in either case.
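
To make the two driver behaviours concrete, here is a self-contained toy model of the in_cancel guard. It is only a sketch: ToyDMA and the toy_* functions are hypothetical stand-ins, and release_count merely counts where qemu_aio_release(dbs) would be called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for DMAAIOCB; only in_cancel follows the mail */
typedef struct ToyDMA {
    bool has_child;     /* stands in for dbs->acb != NULL */
    bool in_cancel;
    int release_count;  /* times the toy "qemu_aio_release" ran */
} ToyDMA;

static void toy_dma_complete(ToyDMA *dbs)
{
    if (!dbs->in_cancel) {
        dbs->release_count++;   /* stands in for qemu_aio_release(dbs) */
    }
}

/* Driver style A: cancel waits for the request, so the completion
 * callback (and with it dma_complete) runs inside the cancel call. */
static void toy_cancel_waits(ToyDMA *dbs)
{
    toy_dma_complete(dbs);
}

/* Driver style B: cancel drops the request without calling back */
static void toy_cancel_drops(ToyDMA *dbs)
{
    (void)dbs;
}

static void toy_dma_aio_cancel(ToyDMA *dbs, void (*child_cancel)(ToyDMA *))
{
    assert(child_cancel != NULL);
    if (dbs->has_child) {
        dbs->has_child = false;
        dbs->in_cancel = true;
        child_cancel(dbs);
        dbs->in_cancel = false;
    }
    toy_dma_complete(dbs);      /* the one call that releases */
}

/* Run one cancellation and report how often release happened */
static int toy_releases_after_cancel(void (*child_cancel)(ToyDMA *))
{
    ToyDMA dbs = { .has_child = true };

    toy_dma_aio_cancel(&dbs, child_cancel);
    return dbs.release_count;
}
```

In this model both toy_releases_after_cancel(toy_cancel_waits) and toy_releases_after_cancel(toy_cancel_drops) come out as 1, i.e. no double release and no leak. It says nothing about what else dma_complete does when it runs twice.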

> In fact it may be worse than just the qemu_aio_release: if the driver is 
> waiting for the request to complete, it will write over the bounce 
> buffer after dma_bdrv_unmap has been called.

How so? dma_bdrv_unmap is only called afterwards, isn't it?

Kevin

Re: [Qemu-devel] [PATCH 3/5] dma-helpers: rewrite completion/cancellation
