On 30.11.2017 at 15:35, Alberto Garcia wrote:
> On Thu 30 Nov 2017 01:27:32 PM CET, Kevin Wolf wrote:
> 
> >> Destroying a paused block job during bdrv_reopen_multiple() has two
> >> consequences:
> >> 
> >>    1) The references to the nodes involved in the job are released,
> >>       possibly destroying some of them. If those nodes were in the
> >>       reopen queue this would trigger the problem originally described
> >>       in commit 40840e419be, crashing QEMU.
> >
> > This specific problem could be avoided by making the BDS reference in
> > the reopen queue strong, i.e. bdrv_ref() in bdrv_reopen_queue_child()
> > and bdrv_unref() only at the end of bdrv_reopen_multiple().
> 
> That is correct.
> 
> >>    2) At the end of bdrv_reopen_multiple(), bdrv_drain_all_end() would
> >>       not be doing all necessary bdrv_parent_drained_end() calls.
> >
> > If I understand correctly, you don't have a reproducer here.
> 
> That's unfortunately not correct.
> 
> You can use the very test case that I mentioned in the commit message:
> 
>    https://lists.gnu.org/archive/html/qemu-block/2017-11/msg00934.html
> 
> With that one, QEMU master crashes easily because of problem (1). If I
> hold strong references in the reopen queue as you mentioned, the test
> case hangs because of problem (2).

Ok, thanks. I'll try to play with this a bit myself later.
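
To make the strong-reference idea concrete, here is a toy model. It is not
QEMU code: Node, ReopenQueue, queue_add() and queue_finish() are made-up
names that stand in for the BDS, the reopen queue,
bdrv_reopen_queue_child() and the end of bdrv_reopen_multiple(). It only
illustrates problem (1) and says nothing about the drain accounting in
problem (2).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a BlockDriverState: only the refcount matters here. */
typedef struct Node {
    int refcnt;
} Node;

static int nodes_freed;   /* counts destroyed nodes, for illustration only */

static Node *node_new(void)
{
    Node *n = calloc(1, sizeof(*n));
    assert(n);
    n->refcnt = 1;        /* the creator owns the first reference */
    return n;
}

static void node_ref(Node *n)
{
    n->refcnt++;
}

static void node_unref(Node *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        nodes_freed++;
        free(n);
    }
}

#define QUEUE_MAX 8

/* Hypothetical reopen queue whose entries hold strong references. */
typedef struct ReopenQueue {
    Node *entries[QUEUE_MAX];
    int len;
} ReopenQueue;

/* Analogous to taking bdrv_ref() when a node is queued. */
static void queue_add(ReopenQueue *q, Node *n)
{
    assert(q->len < QUEUE_MAX);
    node_ref(n);
    q->entries[q->len++] = n;
}

/* Analogous to calling bdrv_unref() only once the reopen is finished. */
static void queue_finish(ReopenQueue *q)
{
    for (int i = 0; i < q->len; i++) {
        node_unref(q->entries[i]);
    }
    q->len = 0;
}

/* Returns 1 if the node outlives the "job" until the queue is done. */
static int demo_strong_queue(void)
{
    int freed_before = nodes_freed;
    ReopenQueue q = { .len = 0 };
    Node *n = node_new();          /* reference held by the "job" */

    queue_add(&q, n);
    node_unref(n);                 /* job destroyed in the middle of reopen */

    /* The queue's reference keeps the node alive at this point. */
    int alive = (nodes_freed == freed_before) && (q.entries[0]->refcnt == 1);

    queue_finish(&q);              /* the last reference goes away here */
    return alive && (nodes_freed == freed_before + 1);
}
```

Calling demo_strong_queue() walks through the sequence: the job's
reference goes away while the node is still queued, and the node is only
freed in queue_finish(). With a weak queue entry (no node_ref() in
queue_add()), the node_unref() in the middle would already free the node
and leave a dangling pointer in the queue.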

> > It's certainly not a full solution because keeping a reference to a
> > block job does not prevent it from completing, but only from being
> > freed. Most block jobs do graph modifications, including dropping the
> > references to nodes, already when they complete, not only when they
> > are freed.
> 
> Yes but the block job itself holds additional references (thanks to
> block_job_add_bdrv()).

Mirror and commit call block_job_remove_all_bdrv() during their
completion, which already drops those references. So yes, keeping a
reference to the job does help for streaming, but not for all block
jobs.

Kevin
