On 16/11/2017 6:54 PM, Alberto Garcia wrote:
On Wed 15 Nov 2017 05:31:20 PM CET, Anton Nefedov wrote:
I have the impression that one major source of headaches is the fact
that the reopen queue contains nodes that don't need to be reopened at
all. Ideally this should be detected early on in bdrv_reopen_queue(), so
there's no chance that the queue contains nodes used by a different
block job. If we had that then op blockers should be enough to prevent
these things. Or am I missing something?

After applying Max's patch I tried the similar approach; that is keep
BDSes referenced while they are in the reopen queue.  Now I get the
stream job hanging. Somehow one blk_root_drained_begin() is not paired
with blk_root_drained_end(). So the job stays paused.

I can see this if I apply Max's patch and keep refs to BDSs in the
reopen queue:

#0  block_job_pause (...) at blockjob.c:130
#1  0x000055c143cb586d in block_job_drained_begin (...) at blockjob.c:227
#2  0x000055c143d08067 in blk_set_dev_ops (...) at block/block-backend.c:887
#3  0x000055c143cb69db in block_job_create (...) at blockjob.c:678
#4  0x000055c143d17c0c in mirror_start_job (...) at block/mirror.c:1177

There's a ops->drained_begin(opaque) call in blk_set_dev_ops() that
doesn't seem to be paired.

My understanding for now is

  1. in bdrv_drain_all_begin(), BDS gets drained with
     bdrv_parent_drained_begin(), all the parents get
     blk_root_drained_begin(), pause their jobs.
  2. one of the jobs finishes and deletes BB.
  3. in bdrv_drain_all_end(), bdrv_parent_drained_end() is never
     called because even though BDS still exists (referenced in the
     queue), it cannot be accessed as bdrv_next() takes BDS from the
     global BB list (and the corresponding BB is deleted).

Max's patch v1 could have helped:
http://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg01835.html

Or, perhaps another approach, keep BlockJob referenced while it is
paused (by block_job_pause/resume_all()). That should prevent it from
deleting the BB.


And when you call block_job_start() then it
yields immediately waiting for the resume (that never arrives).

John, this change was yours (f4d9cc88ee69a5b04). Any idea?

Berto


Reply via email to