Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] iotests: fix test case 185

2018-04-03 Thread Stefan Hajnoczi
On Tue, Mar 27, 2018 at 11:32:00AM +0800, QingFeng Hao wrote:
> 
> 在 2018/3/23 18:04, Stefan Hajnoczi 写道:
> > On Fri, Mar 23, 2018 at 3:43 AM, QingFeng Hao  
> > wrote:
> > > Test case 185 failed since commit 4486e89c219 --- "vl: introduce 
> > > vm_shutdown()".
> > > It's because of the newly introduced function vm_shutdown calls 
> > > bdrv_drain_all,
> > > which is called later by bdrv_close_all. bdrv_drain_all resumes the jobs
> > > that doubles the speed and offset is doubled.
> > > Some jobs' status are changed as well.
> > > 
> > > The fix is to not resume the jobs that are already yielded and also change
> > > 185.out accordingly.
> > > 
> > > Suggested-by: Stefan Hajnoczi 
> > > Signed-off-by: QingFeng Hao 
> > > ---
> > >   blockjob.c | 10 +-
> > >   include/block/blockjob.h   |  5 +
> > >   tests/qemu-iotests/185.out | 11 +--
> > 
> > If drain no longer forces the block job to iterate, shouldn't the test
> > output remain the same?  (The means the test is fixed by the QEMU
> > patch.)
> > 
> > >   3 files changed, 23 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/blockjob.c b/blockjob.c
> > > index ef3ed69ff1..fa9838ac97 100644
> > > --- a/blockjob.c
> > > +++ b/blockjob.c
> > > @@ -206,11 +206,16 @@ void block_job_txn_add_job(BlockJobTxn *txn, 
> > > BlockJob *job)
> > > 
> > >   static void block_job_pause(BlockJob *job)
> > >   {
> > > -job->pause_count++;
> > > +if (!job->yielded) {
> > > +job->pause_count++;
> > > +}
> > 
> > The pause cannot be ignored.  This change introduces a bug.
> > 
> > Pause is not a synchronous operation that stops the job immediately.
> > Pause just remembers that the job needs to be paused.   When the job
> > runs again (e.g. timer callback, fd handler) it eventually reaches
> > block_job_pause_point() where it really pauses.
> > 
> > The bug in this patch is:
> > 
> > 1. The job has a timer pending.
> > 2. block_job_pause() is called during drain.
> > 3. The timer fires during drain but now the job doesn't know it needs
> > to pause, so it continues running!
> > 
> > Instead what needs to happen is that block_job_pause() remains
> > unmodified but block_job_resume if extended:
> > 
> > static void block_job_resume(BlockJob *job)
> > {
> >  assert(job->pause_count > 0);
> >  job->pause_count--;
> >  if (job->pause_count) {
> >  return;
> >  }
> > +if (job_yielded_before_pause_and_is_still_yielded) {
> Thanks a lot for your great comments! But I can't figure out how to check
> this.
> >  block_job_enter(job);
> > +}
> > }
> > 
> > This handles the case I mentioned above, where the yield ends before
> > pause ends (therefore resume must enter the job!).
> > 
> > To make this a little clearer, there are two cases to consider:
> > 
> > Case 1:
> > 1. Job yields
> > 2. Pause
> > 3. Job is entered from timer/fd callback
> How do I know that if job is entered from ...? thanks

Sorry, in order to answer your question properly I would have to study
the code and get the point where I could write the patch myself.

I have sent a patch to update the test output for the upcoming QEMU 2.12
release.  At this time in the release cycle it's the most appropriate
solution.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] iotests: fix test case 185

2018-03-26 Thread QingFeng Hao



在 2018/3/26 18:29, Kevin Wolf 写道:

Am 23.03.2018 um 04:43 hat QingFeng Hao geschrieben:

Test case 185 failed since commit 4486e89c219 --- "vl: introduce vm_shutdown()".
It's because of the newly introduced function vm_shutdown calls bdrv_drain_all,
which is called later by bdrv_close_all. bdrv_drain_all resumes the jobs
that doubles the speed and offset is doubled.
Some jobs' status are changed as well.

The fix is to not resume the jobs that are already yielded and also change
185.out accordingly.

Suggested-by: Stefan Hajnoczi 
Signed-off-by: QingFeng Hao 


Stefan already commented on the fix itself, but I want to add two more
points:

Please change your subject line. "iotests: fix test case 185" means that
you are fixing the test case, not qemu code that makes the test case
fail. The subject line should describe the actual change. In all
likelihood it will start with "blockjob:" rather than "iotests:".

Sure! thanks for pointing that.



diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index fc645dac68..f8f208bbcf 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -99,6 +99,11 @@ typedef struct BlockJob {
  bool ready;
  
  /**

+ * Set to true when the job is yielded.
+ */
+bool yielded;


This is the same as !busy, so we don't need a new field for this.

Mostly yes, but the trick is that busy is set to be true in 
block_job_do_yield.

Kevin



--
Regards
QingFeng Hao




Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] iotests: fix test case 185

2018-03-26 Thread QingFeng Hao


在 2018/3/23 18:04, Stefan Hajnoczi 写道:

On Fri, Mar 23, 2018 at 3:43 AM, QingFeng Hao  wrote:

Test case 185 failed since commit 4486e89c219 --- "vl: introduce vm_shutdown()".
It's because of the newly introduced function vm_shutdown calls bdrv_drain_all,
which is called later by bdrv_close_all. bdrv_drain_all resumes the jobs
that doubles the speed and offset is doubled.
Some jobs' status are changed as well.

The fix is to not resume the jobs that are already yielded and also change
185.out accordingly.

Suggested-by: Stefan Hajnoczi 
Signed-off-by: QingFeng Hao 
---
  blockjob.c | 10 +-
  include/block/blockjob.h   |  5 +
  tests/qemu-iotests/185.out | 11 +--


If drain no longer forces the block job to iterate, shouldn't the test
output remain the same?  (The means the test is fixed by the QEMU
patch.)


  3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index ef3ed69ff1..fa9838ac97 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -206,11 +206,16 @@ void block_job_txn_add_job(BlockJobTxn *txn, BlockJob 
*job)

  static void block_job_pause(BlockJob *job)
  {
-job->pause_count++;
+if (!job->yielded) {
+job->pause_count++;
+}


The pause cannot be ignored.  This change introduces a bug.

Pause is not a synchronous operation that stops the job immediately.
Pause just remembers that the job needs to be paused.   When the job
runs again (e.g. timer callback, fd handler) it eventually reaches
block_job_pause_point() where it really pauses.

The bug in this patch is:

1. The job has a timer pending.
2. block_job_pause() is called during drain.
3. The timer fires during drain but now the job doesn't know it needs
to pause, so it continues running!

Instead what needs to happen is that block_job_pause() remains
unmodified but block_job_resume if extended:

static void block_job_resume(BlockJob *job)
{
 assert(job->pause_count > 0);
 job->pause_count--;
 if (job->pause_count) {
 return;
 }
+if (job_yielded_before_pause_and_is_still_yielded) {
Thanks a lot for your great comments! But I can't figure out how to 
check this.

 block_job_enter(job);
+}
}

This handles the case I mentioned above, where the yield ends before
pause ends (therefore resume must enter the job!).

To make this a little clearer, there are two cases to consider:

Case 1:
1. Job yields
2. Pause
3. Job is entered from timer/fd callback

How do I know that if job is entered from ...? thanks

4. Resume (enter job? yes)

Case 2:
1. Job yields
2. Pause
3. Resume (enter job? no)
4. Job is entered from timer/fd callback

Stefan



--
Regards
QingFeng Hao




Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] iotests: fix test case 185

2018-03-23 Thread Stefan Hajnoczi
On Fri, Mar 23, 2018 at 3:43 AM, QingFeng Hao  wrote:
> Test case 185 failed since commit 4486e89c219 --- "vl: introduce 
> vm_shutdown()".
> It's because of the newly introduced function vm_shutdown calls 
> bdrv_drain_all,
> which is called later by bdrv_close_all. bdrv_drain_all resumes the jobs
> that doubles the speed and offset is doubled.
> Some jobs' status are changed as well.
>
> The fix is to not resume the jobs that are already yielded and also change
> 185.out accordingly.
>
> Suggested-by: Stefan Hajnoczi 
> Signed-off-by: QingFeng Hao 
> ---
>  blockjob.c | 10 +-
>  include/block/blockjob.h   |  5 +
>  tests/qemu-iotests/185.out | 11 +--

If drain no longer forces the block job to iterate, shouldn't the test
output remain the same?  (The means the test is fixed by the QEMU
patch.)

>  3 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/blockjob.c b/blockjob.c
> index ef3ed69ff1..fa9838ac97 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -206,11 +206,16 @@ void block_job_txn_add_job(BlockJobTxn *txn, BlockJob 
> *job)
>
>  static void block_job_pause(BlockJob *job)
>  {
> -job->pause_count++;
> +if (!job->yielded) {
> +job->pause_count++;
> +}

The pause cannot be ignored.  This change introduces a bug.

Pause is not a synchronous operation that stops the job immediately.
Pause just remembers that the job needs to be paused.   When the job
runs again (e.g. timer callback, fd handler) it eventually reaches
block_job_pause_point() where it really pauses.

The bug in this patch is:

1. The job has a timer pending.
2. block_job_pause() is called during drain.
3. The timer fires during drain but now the job doesn't know it needs
to pause, so it continues running!

Instead what needs to happen is that block_job_pause() remains
unmodified but block_job_resume if extended:

static void block_job_resume(BlockJob *job)
{
assert(job->pause_count > 0);
job->pause_count--;
if (job->pause_count) {
return;
}
+if (job_yielded_before_pause_and_is_still_yielded) {
block_job_enter(job);
+}
}

This handles the case I mentioned above, where the yield ends before
pause ends (therefore resume must enter the job!).

To make this a little clearer, there are two cases to consider:

Case 1:
1. Job yields
2. Pause
3. Job is entered from timer/fd callback
4. Resume (enter job? yes)

Case 2:
1. Job yields
2. Pause
3. Resume (enter job? no)
4. Job is entered from timer/fd callback

Stefan