Hi Stefan,

On 2020/8/12 21:51, Stefan Hajnoczi wrote:
> On Mon, Aug 10, 2020 at 10:52:44PM +0800, Zhenyu Ye wrote:
>> Before doing qmp actions, we need to lock the qemu_global_mutex,
>> so the qmp actions should not take too long time.
>>
>> Unfortunately, some qmp actions need to acquire aio context and
>> this may take a long time.  The vm will soft lockup if this time
>> is too long.
>>
>> So add a timeout mechanism while doing qmp actions.
> 
> aio_context_acquire_timeout() is a workaround for code that blocks the
> event loop. Ideally there should be no code that blocks the event loop.
> 
> Which cases have you found where the event loop is blocked?
> 

Currently I only found the io_submit() will block while I/O pressure
is too high, for details, see:

        
https://lore.kernel.org/qemu-devel/c6d75e49-3e36-6a76-fdc8-cdf09e7c3...@huawei.com/

io_submit can not ensure non-blocking at any time.

> I think they should be investigated and fixed (if possible) before
> introducing an API like aio_context_acquire_timeout().
> 

We cannot ensure that everything is non-blocking in iothread, because
some actions seems like asynchronous but will block in some times (such
as io_submit).  Anyway, the _timeout() API can make these qmp commands
(which need to get aio_context) be safer.

Thanks,
Zhenyu


Reply via email to