Re: [Qemu-block] [PATCH] qed: fix bdrv_qed_drain

Stefan Hajnoczi Wed, 09 Mar 2016 07:46:19 -0800

On Tue, Mar 08, 2016 at 10:58:47AM +0100, Kevin Wolf wrote:
> Am 07.03.2016 um 21:56 hat Stefan Hajnoczi geschrieben:
> > On Mon, Mar 07, 2016 at 05:57:41PM +0100, Kevin Wolf wrote:
> > > Am 23.02.2016 um 14:54 hat Paolo Bonzini geschrieben:
> > > > 
> > > > 
> > > > On 23/02/2016 13:49, Fam Zheng wrote:
> > > > > On Tue, 02/23 11:43, Paolo Bonzini wrote:
> > > > >>
> > > > >>
> > > > >> On 23/02/2016 06:57, Fam Zheng wrote:
> > > > >>>>>> +        qed_cancel_need_check_timer(s);
> > > > >>>>>> +        qed_need_check_timer_cb(s);
> > > > >>>>>> +    }
> > > > >>>>>
> > > > >>>>> What if an allocating write is queued (the else branch case)? Its 
> > > > >>>>> completion
> > > > >>>>> will be in bdrv_drain and it could arm the need_check_timer which 
> > > > >>>>> is wrong.
> > > > >>>>>
> > > > >>>>> We need to drain the allocating_write_reqs queue before checking 
> > > > >>>>> the timer.
> > > > >>>>
> > > > >>>> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> > > > >>>> chicken-and-egg problem.
> > > > >>>
> > > > >>> Maybe use an aio_poll loop before the if?
> > > > >>
> > > > >> That would not change the fact that you're reimplementing bdrv_drain
> > > > >> inside bdrv_qed_drain.
> > > > > 
> > > > > But it fulfills the contract of .bdrv_drain. This is the easy way, 
> > > > > the hard way
> > > > > would be iterating through the allocating_write_reqs list and process 
> > > > > reqs one
> > > > > by one synchronously, which still involves aio_poll indirectly.
> > > > 
> > > > The easy way would be better then.
> > > > 
> > > > Stefan, any second opinion?
> > > 
> > > What's the status here? It would be good to have qed not segfaulting all
> > > the time.
> > 
> > I think the timer concept itself is troublesome.  A radical approach but
> > limited to changing QED itself is to drop the timer and instead keep a
> > timestamp for the last allocating write request.  Next time a write
> > request (allocating or rewriting) is about to complete, do the flush and
> > clear the need check flag as part of the write request (if 5 seconds
> > have passed since the timestamp).
> 
> Isn't that kind of backwards, though, because it's very likely that
> after some new activity a second write request will come in soon and
> mark the image dirty again?
> 
> Apart from that, doesn't it fail to provide the intended effect, that
> the image doesn't stay dirty for a long time and a repair isn't usually
> needed after a crash?


Yes, it relies on a non-allocating write request after the timeout.

I've reverted the drain patch for now.

Stefan

signature.asc
Description: PGP signature

Re: [Qemu-block] [PATCH] qed: fix bdrv_qed_drain

Reply via email to