On Wed, Jul 06, 2022 at 12:59:29PM +0100, Stefan Hajnoczi wrote: >On Fri, Jul 01, 2022 at 05:13:48PM +0800, Chao Gao wrote: >> When measuring FIO read performance (cache=writethrough, bs=4k, iodepth=64) >> in >> VMs, we observe ~80K/s notifications (e.g., EPT_MISCONFIG) from guest to >> qemu. > >It's not clear to me what caused the frequent poll_set_started(ctx, >false) calls and whether this patch is correct. I have posted some
Me either. That's why the patch was marked RFC. >questions below about the nature of this issue. > >If ctx->fdmon_ops->wait() is called while polling is still enabled then >hangs or unnecessary latency can occur. For example, consider an fd >handler that temporarily suppresses fd activity between poll start/end. >The thread would be blocked in ->wait() and the fd will never become >readable. Even if a hang doesn't occur because there is a timeout, there >would be extra latency because the fd doesn't become readable and we >have to wait for the timeout to expire so we can poll again. So we must >be sure it's safe to leave polling enabled across the event loop and I'm >not sure if this patch is okay. Thanks for the explanation. in aio_poll(), if (timeout || ctx->fdmon_ops->need_wait(ctx)) { ctx->fdmon_ops->wait(ctx, &ready_list, timeout); } if @timeout is zero, then ctx->fdmon_ops->wait() won't be called. In case #2 and #3 below, it is guaranteed that @timeout is zero after try_poll_mode() return. So, I think it is safe to keep polling enabled for these two cases. > >> >> Currently, poll_set_started(ctx,false) is called in try_poll_mode() to enable >> virtqueue notification in below 4 cases: >> >> 1. ctx->poll_ns is 0 >> 2. a zero timeout is passed to try_poll_mode() >> 3. polling succeeded but reported as no progress >> 4. polling failed and reported as no progress >> >> To minimize unnecessary guest notifications, keep notification disabled when >> it is possible, i.e., polling is enabled and last polling doesn't fail. > >What is the exact definition of polling success/failure? Polling success: found some events pending. Polling failure: timeout. success/failure are used because I saw below comment in run_poll_handlers_once(), then I thought they are well-known terms. /* * Polling was successful, exit try_poll_mode immediately * to adjust the next polling time. */ > >> >> Keep notification disabled for case #2 and #3; handle case #2 simply by a >> call > >Did you see case #2 happening often? What was the cause? I think so. I can add some tracepoint and collect statistics. IMO (of course, I can be totally wrong), the cause is: when a worker thread in thread poll completes an IO request, a bottom half is queued by work_thread()->qemu_bh_schedule(). Pending bottom halves lead to aio_compute_timeout() setting timeout to zero and then a zero timeout is passed to try_poll_mode(). > >> of run_poll_handlers_once() and for case #3, differentiate successful/failed >> polling and skip the call of poll_set_started(ctx,false) for successful ones. > >This is probably the most interesting case. When polling detects an >event, that's considered "progress", except for aio_notify() events. >aio_notify() is an internal event for restarting the event loop when >something has changed (e.g. fd handlers have been added/removed). I >wouldn't expect it to intefere polling frequently since that requires >another QEMU thread doing something to the AioContext, which should be >rare. > >Was aio_notify() intefering with polling in your case? Do you know why? Yes. It was. The reason is the same: after finishing IO requests, worker threads queue bottom halves during which aio_notify() is called.