Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain

2016-03-07 Thread Paolo Bonzini


On 07/03/2016 21:56, Stefan Hajnoczi wrote:
> I think the timer concept itself is troublesome.  A radical approach but
> limited to changing QED itself is to drop the timer and instead keep a
> timestamp for the last allocating write request.  Next time a write
> request (allocating or rewriting) is about to complete, do the flush and
> clear the need check flag as part of the write request (if 5 seconds
> have passed since the timestamp).

bdrv_qed_drain should be easy to fix in my new drain implementation,
which is based on draining the parent before the BdrvChild-ren.  It's
just troublesome in the current one which alternates flushing and draining.

I would just revert the patch that introduced bdrv_qed_drain for now,
and reintroduce it later (note however that something was messed up in
commit df9a681, "qed: Implement .bdrv_drain", 2015-11-12, because it
includes some dirty bitmap stuff).

Or perhaps the idea of making QED read-only has some merit...

Paolo



Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain

2016-02-23 Thread Paolo Bonzini


On 23/02/2016 13:49, Fam Zheng wrote:
> On Tue, 02/23 11:43, Paolo Bonzini wrote:
>>
>>
>> On 23/02/2016 06:57, Fam Zheng wrote:
>> +qed_cancel_need_check_timer(s);
>> +qed_need_check_timer_cb(s);
>> +}
>
> What if an allocating write is queued (the else branch case)? Its 
> completion
> will be in bdrv_drain and it could arm the need_check_timer which is 
> wrong.
>
> We need to drain the allocating_write_reqs queue before checking the 
> timer.

 You're right, but how?  That's what bdrv_drain(bs) does, it's a
 chicken-and-egg problem.
>>>
>>> Maybe use an aio_poll loop before the if?
>>
>> That would not change the fact that you're reimplementing bdrv_drain
>> inside bdrv_qed_drain.
> 
> But it fulfills the contract of .bdrv_drain. This is the easy way, the hard 
> way
> would be iterating through the allocating_write_reqs list and process reqs one
> by one synchronously, which still involves aio_poll indirectly.

The easy way would be better then.

Stefan, any second opinion?

Paolo

>> Perhaps for now it's simplest to just remove the QED .bdrv_drain
>> callback, if you think this patch is not a good stopgap measure to avoid
>> the segmentation faults.
> 
> OK, I'm fine with this as a stopgap measure.
> 
>> Once the bdrv_drain rework is in, we can move the callback _after_ I/O
>> is drained on bs and before it is drained on bs->file->bs.
> 
> Sounds good.



Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain

2016-02-23 Thread Fam Zheng
On Tue, 02/23 11:43, Paolo Bonzini wrote:
> 
> 
> On 23/02/2016 06:57, Fam Zheng wrote:
>  +qed_cancel_need_check_timer(s);
>  +qed_need_check_timer_cb(s);
>  +}
> >>>
> >>> What if an allocating write is queued (the else branch case)? Its 
> >>> completion
> >>> will be in bdrv_drain and it could arm the need_check_timer which is 
> >>> wrong.
> >>>
> >>> We need to drain the allocating_write_reqs queue before checking the 
> >>> timer.
> >>
> >> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> >> chicken-and-egg problem.
> > 
> > Maybe use an aio_poll loop before the if?
> 
> That would not change the fact that you're reimplementing bdrv_drain
> inside bdrv_qed_drain.
> 

But it fulfills the contract of .bdrv_drain. This is the easy way, the hard way
would be iterating through the allocating_write_reqs list and process reqs one
by one synchronously, which still involves aio_poll indirectly.

> Perhaps for now it's simplest to just remove the QED .bdrv_drain
> callback, if you think this patch is not a good stopgap measure to avoid
> the segmentation faults.

OK, I'm fine with this as a stopgap measure.

> 
> Once the bdrv_drain rework is in, we can move the callback _after_ I/O
> is drained on bs and before it is drained on bs->file->bs.

Sounds good.

Fam



Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain

2016-02-23 Thread Paolo Bonzini


On 23/02/2016 06:57, Fam Zheng wrote:
 +qed_cancel_need_check_timer(s);
 +qed_need_check_timer_cb(s);
 +}
>>>
>>> What if an allocating write is queued (the else branch case)? Its completion
>>> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
>>>
>>> We need to drain the allocating_write_reqs queue before checking the timer.
>>
>> You're right, but how?  That's what bdrv_drain(bs) does, it's a
>> chicken-and-egg problem.
> 
> Maybe use an aio_poll loop before the if?

That would not change the fact that you're reimplementing bdrv_drain
inside bdrv_qed_drain.

Perhaps for now it's simplest to just remove the QED .bdrv_drain
callback, if you think this patch is not a good stopgap measure to avoid
the segmentation faults.

Once the bdrv_drain rework is in, we can move the callback _after_ I/O
is drained on bs and before it is drained on bs->file->bs.

Paolo



Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain

2016-02-22 Thread Fam Zheng
On Wed, 02/17 12:28, Paolo Bonzini wrote:
> 
> 
> On 17/02/2016 03:57, Fam Zheng wrote:
> > On Tue, 02/16 16:53, Paolo Bonzini wrote:
> >> The current implementation of bdrv_qed_drain can cause a double
> >> completion of a request.
> >>
> >> The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
> >> unconditionally, but this is not correct if an allocating write
> >> is queued.  In this case, qed_unplug_allocating_write_reqs will
> >> restart the allocating write and possibly cause it to complete.
> >> The aiocb however is still in use for the L2/L1 table writes,
> >> and will then be completed again as soon as the table writes
> >> are stable.
> >>
> >> The fix is to only call qed_plug_allocating_write_reqs and
> >> bdrv_aio_flush (which is the same as the timer callback---the patch
> >> makes this explicit) only if the timer was scheduled in the first
> >> place.  This fixes qemu-iotests test 011.
> >>
> >> Cc: qemu-sta...@nongnu.org
> >> Cc: qemu-bl...@nongnu.org
> >> Cc: Stefan Hajnoczi 
> >> Signed-off-by: Paolo Bonzini 
> >> ---
> >>  block/qed.c | 13 +++--
> >>  1 file changed, 7 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/block/qed.c b/block/qed.c
> >> index 404be1e..ebba220 100644
> >> --- a/block/qed.c
> >> +++ b/block/qed.c
> >> @@ -380,12 +380,13 @@ static void bdrv_qed_drain(BlockDriverState *bs)
> >>  {
> >>  BDRVQEDState *s = bs->opaque;
> >>  
> >> -/* Cancel timer and start doing I/O that were meant to happen as if it
> >> - * fired, that way we get bdrv_drain() taking care of the ongoing 
> >> requests
> >> - * correctly. */
> >> -qed_cancel_need_check_timer(s);
> >> -qed_plug_allocating_write_reqs(s);
> >> -bdrv_aio_flush(s->bs, qed_clear_need_check, s);
> >> +/* Fire the timer immediately in order to start doing I/O as soon as 
> >> the
> >> + * header is flushed.
> >> + */
> >> +if (s->need_check_timer && timer_pending(s->need_check_timer)) {
> > 
> > We can assert(s->need_check_timer);
> 
> I've seen it NULL, but didn't check why.  This was also a source of
> segmentation faults.
> 
> >> +qed_cancel_need_check_timer(s);
> >> +qed_need_check_timer_cb(s);
> >> +}
> > 
> > What if an allocating write is queued (the else branch case)? Its completion
> > will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> > 
> > We need to drain the allocating_write_reqs queue before checking the timer.
> 
> You're right, but how?  That's what bdrv_drain(bs) does, it's a
> chicken-and-egg problem.

Maybe use an aio_poll loop before the if?

Fam



Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain

2016-02-17 Thread Paolo Bonzini


On 17/02/2016 03:57, Fam Zheng wrote:
> On Tue, 02/16 16:53, Paolo Bonzini wrote:
>> The current implementation of bdrv_qed_drain can cause a double
>> completion of a request.
>>
>> The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
>> unconditionally, but this is not correct if an allocating write
>> is queued.  In this case, qed_unplug_allocating_write_reqs will
>> restart the allocating write and possibly cause it to complete.
>> The aiocb however is still in use for the L2/L1 table writes,
>> and will then be completed again as soon as the table writes
>> are stable.
>>
>> The fix is to only call qed_plug_allocating_write_reqs and
>> bdrv_aio_flush (which is the same as the timer callback---the patch
>> makes this explicit) only if the timer was scheduled in the first
>> place.  This fixes qemu-iotests test 011.
>>
>> Cc: qemu-sta...@nongnu.org
>> Cc: qemu-bl...@nongnu.org
>> Cc: Stefan Hajnoczi 
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  block/qed.c | 13 +++--
>>  1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/block/qed.c b/block/qed.c
>> index 404be1e..ebba220 100644
>> --- a/block/qed.c
>> +++ b/block/qed.c
>> @@ -380,12 +380,13 @@ static void bdrv_qed_drain(BlockDriverState *bs)
>>  {
>>  BDRVQEDState *s = bs->opaque;
>>  
>> -/* Cancel timer and start doing I/O that were meant to happen as if it
>> - * fired, that way we get bdrv_drain() taking care of the ongoing 
>> requests
>> - * correctly. */
>> -qed_cancel_need_check_timer(s);
>> -qed_plug_allocating_write_reqs(s);
>> -bdrv_aio_flush(s->bs, qed_clear_need_check, s);
>> +/* Fire the timer immediately in order to start doing I/O as soon as the
>> + * header is flushed.
>> + */
>> +if (s->need_check_timer && timer_pending(s->need_check_timer)) {
> 
> We can assert(s->need_check_timer);

I've seen it NULL, but didn't check why.  This was also a source of
segmentation faults.

>> +qed_cancel_need_check_timer(s);
>> +qed_need_check_timer_cb(s);
>> +}
> 
> What if an allocating write is queued (the else branch case)? Its completion
> will be in bdrv_drain and it could arm the need_check_timer which is wrong.
> 
> We need to drain the allocating_write_reqs queue before checking the timer.

You're right, but how?  That's what bdrv_drain(bs) does, it's a
chicken-and-egg problem.

In my new series, draining works separately on each BlockDriverState
along the chain, from parent (bs) to children (bs->file->bs).  We could
then have .before_drain and .after_drain; .before_drain disables the
timer, while .after_drain does the same operation as the timer.  But
that couldn't be backported, and 2.5 would remain broken forever.  This
patch at least is a band-aid to make QED functional.

Perhaps the alternative is to remove write support for QED altogether
(and the drain callback with it, since it's the only user).  Not sure
why anyone would use it these days.

Paolo



Re: [Qemu-devel] [PATCH] qed: fix bdrv_qed_drain

2016-02-16 Thread Fam Zheng
On Tue, 02/16 16:53, Paolo Bonzini wrote:
> The current implementation of bdrv_qed_drain can cause a double
> completion of a request.
> 
> The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
> unconditionally, but this is not correct if an allocating write
> is queued.  In this case, qed_unplug_allocating_write_reqs will
> restart the allocating write and possibly cause it to complete.
> The aiocb however is still in use for the L2/L1 table writes,
> and will then be completed again as soon as the table writes
> are stable.
> 
> The fix is to only call qed_plug_allocating_write_reqs and
> bdrv_aio_flush (which is the same as the timer callback---the patch
> makes this explicit) only if the timer was scheduled in the first
> place.  This fixes qemu-iotests test 011.
> 
> Cc: qemu-sta...@nongnu.org
> Cc: qemu-bl...@nongnu.org
> Cc: Stefan Hajnoczi 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/qed.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/block/qed.c b/block/qed.c
> index 404be1e..ebba220 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -380,12 +380,13 @@ static void bdrv_qed_drain(BlockDriverState *bs)
>  {
>  BDRVQEDState *s = bs->opaque;
>  
> -/* Cancel timer and start doing I/O that were meant to happen as if it
> - * fired, that way we get bdrv_drain() taking care of the ongoing 
> requests
> - * correctly. */
> -qed_cancel_need_check_timer(s);
> -qed_plug_allocating_write_reqs(s);
> -bdrv_aio_flush(s->bs, qed_clear_need_check, s);
> +/* Fire the timer immediately in order to start doing I/O as soon as the
> + * header is flushed.
> + */
> +if (s->need_check_timer && timer_pending(s->need_check_timer)) {

We can assert(s->need_check_timer);

> +qed_cancel_need_check_timer(s);
> +qed_need_check_timer_cb(s);
> +}

What if an allocating write is queued (the else branch case)? Its completion
will be in bdrv_drain and it could arm the need_check_timer which is wrong.

We need to drain the allocating_write_reqs queue before checking the timer.

Fam



[Qemu-devel] [PATCH] qed: fix bdrv_qed_drain

2016-02-16 Thread Paolo Bonzini
The current implementation of bdrv_qed_drain can cause a double
completion of a request.

The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
unconditionally, but this is not correct if an allocating write
is queued.  In this case, qed_unplug_allocating_write_reqs will
restart the allocating write and possibly cause it to complete.
The aiocb however is still in use for the L2/L1 table writes,
and will then be completed again as soon as the table writes
are stable.

The fix is to only call qed_plug_allocating_write_reqs and
bdrv_aio_flush (which is the same as the timer callback---the patch
makes this explicit) only if the timer was scheduled in the first
place.  This fixes qemu-iotests test 011.

Cc: qemu-sta...@nongnu.org
Cc: qemu-bl...@nongnu.org
Cc: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
---
 block/qed.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 404be1e..ebba220 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -380,12 +380,13 @@ static void bdrv_qed_drain(BlockDriverState *bs)
 {
 BDRVQEDState *s = bs->opaque;
 
-/* Cancel timer and start doing I/O that were meant to happen as if it
- * fired, that way we get bdrv_drain() taking care of the ongoing requests
- * correctly. */
-qed_cancel_need_check_timer(s);
-qed_plug_allocating_write_reqs(s);
-bdrv_aio_flush(s->bs, qed_clear_need_check, s);
+/* Fire the timer immediately in order to start doing I/O as soon as the
+ * header is flushed.
+ */
+if (s->need_check_timer && timer_pending(s->need_check_timer)) {
+qed_cancel_need_check_timer(s);
+qed_need_check_timer_cb(s);
+}
 }
 
 static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
-- 
2.5.0