On Mon, Mar 07, 2016 at 05:57:41PM +0100, Kevin Wolf wrote: > Am 23.02.2016 um 14:54 hat Paolo Bonzini geschrieben: > > > > > > On 23/02/2016 13:49, Fam Zheng wrote: > > > On Tue, 02/23 11:43, Paolo Bonzini wrote: > > >> > > >> > > >> On 23/02/2016 06:57, Fam Zheng wrote: > > >>>>>> + qed_cancel_need_check_timer(s); > > >>>>>> + qed_need_check_timer_cb(s); > > >>>>>> + } > > >>>>> > > >>>>> What if an allocating write is queued (the else branch case)? Its > > >>>>> completion > > >>>>> will be in bdrv_drain and it could arm the need_check_timer which is > > >>>>> wrong. > > >>>>> > > >>>>> We need to drain the allocating_write_reqs queue before checking the > > >>>>> timer. > > >>>> > > >>>> You're right, but how? That's what bdrv_drain(bs) does, it's a > > >>>> chicken-and-egg problem. > > >>> > > >>> Maybe use an aio_poll loop before the if? > > >> > > >> That would not change the fact that you're reimplementing bdrv_drain > > >> inside bdrv_qed_drain. > > > > > > But it fulfills the contract of .bdrv_drain. This is the easy way, the > > > hard way > > > would be iterating through the allocating_write_reqs list and process > > > reqs one > > > by one synchronously, which still involves aio_poll indirectly. > > > > The easy way would be better then. > > > > Stefan, any second opinion? > > What's the status here? It would be good to have qed not segfaulting all > the time.
I think the timer concept itself is troublesome. A radical approach but limited to changing QED itself is to drop the timer and instead keep a timestamp for the last allocating write request. Next time a write request (allocating or rewriting) is about to complete, do the flush and clear the need check flag as part of the write request (if 5 seconds have passed since the timestamp). Because the need check flag clearing is part of the write request completion, it will integrate properly with drain. Stefan
signature.asc
Description: PGP signature