On Tue, Mar 27, 2018 at 11:32:00AM +0800, QingFeng Hao wrote: > > 在 2018/3/23 18:04, Stefan Hajnoczi 写道: > > On Fri, Mar 23, 2018 at 3:43 AM, QingFeng Hao <ha...@linux.vnet.ibm.com> > > wrote: > > > Test case 185 failed since commit 4486e89c219 --- "vl: introduce > > > vm_shutdown()". > > > It's because of the newly introduced function vm_shutdown calls > > > bdrv_drain_all, > > > which is called later by bdrv_close_all. bdrv_drain_all resumes the jobs > > > that doubles the speed and offset is doubled. > > > Some jobs' status are changed as well. > > > > > > The fix is to not resume the jobs that are already yielded and also change > > > 185.out accordingly. > > > > > > Suggested-by: Stefan Hajnoczi <stefa...@gmail.com> > > > Signed-off-by: QingFeng Hao <ha...@linux.vnet.ibm.com> > > > --- > > > blockjob.c | 10 +++++++++- > > > include/block/blockjob.h | 5 +++++ > > > tests/qemu-iotests/185.out | 11 +++++++++-- > > > > If drain no longer forces the block job to iterate, shouldn't the test > > output remain the same? (The means the test is fixed by the QEMU > > patch.) > > > > > 3 files changed, 23 insertions(+), 3 deletions(-) > > > > > > diff --git a/blockjob.c b/blockjob.c > > > index ef3ed69ff1..fa9838ac97 100644 > > > --- a/blockjob.c > > > +++ b/blockjob.c > > > @@ -206,11 +206,16 @@ void block_job_txn_add_job(BlockJobTxn *txn, > > > BlockJob *job) > > > > > > static void block_job_pause(BlockJob *job) > > > { > > > - job->pause_count++; > > > + if (!job->yielded) { > > > + job->pause_count++; > > > + } > > > > The pause cannot be ignored. This change introduces a bug. > > > > Pause is not a synchronous operation that stops the job immediately. > > Pause just remembers that the job needs to be paused. When the job > > runs again (e.g. timer callback, fd handler) it eventually reaches > > block_job_pause_point() where it really pauses. > > > > The bug in this patch is: > > > > 1. The job has a timer pending. > > 2. block_job_pause() is called during drain. > > 3. The timer fires during drain but now the job doesn't know it needs > > to pause, so it continues running! > > > > Instead what needs to happen is that block_job_pause() remains > > unmodified but block_job_resume if extended: > > > > static void block_job_resume(BlockJob *job) > > { > > assert(job->pause_count > 0); > > job->pause_count--; > > if (job->pause_count) { > > return; > > } > > + if (job_yielded_before_pause_and_is_still_yielded) { > Thanks a lot for your great comments! But I can't figure out how to check > this. > > block_job_enter(job); > > + } > > } > > > > This handles the case I mentioned above, where the yield ends before > > pause ends (therefore resume must enter the job!). > > > > To make this a little clearer, there are two cases to consider: > > > > Case 1: > > 1. Job yields > > 2. Pause > > 3. Job is entered from timer/fd callback > How do I know that if job is entered from ...? thanks
Sorry, in order to answer your question properly I would have to study the code and get the point where I could write the patch myself. I have sent a patch to update the test output for the upcoming QEMU 2.12 release. At this time in the release cycle it's the most appropriate solution. Stefan
signature.asc
Description: PGP signature