On Fri, Mar 17, 2017 at 12:27 PM, Paolo Bonzini <pbonz...@redhat.com> wrote:
> And this is a fix, but I have no idea why/how it works and what else it
> may break.
>
> Patches 1 and 2 are pretty obvious and would be the first step towards
> eliminating aio_disable/enable_external altogether.
>
> However I got patch 3 more or less by trial and error, and when I
> thought I had the reasoning right I noticed this:
>
>         bdrv_drained_end(state->old_bs);
>
> in external_snapshot_clean which makes no sense given the
>
>         bdrv_drained_begin(bs_new);
>
> that I added to bdrv_append.  So take this with a ton of salt.
>
> The basic idea is that calling child->role->drained_begin and
> child->role->drained_end is not necessary and in fact actively wrong
> when both the old and the new child should be in a drained section.
> But maybe instead it should be asserted that they are, except for the
> special case of adding or removing a child.  i.e. after
>
>     int drain = !!(old_bs && old_bs->quiesce_counter) - !!(new_bs && 
> new_bs->quiesce_counter);
>
> add
>
>     assert(!(drain && old_bs && new_bs));
>
> Throwing this out because it's Friday evening... Maybe Fam can pick
> it up on Monday.

OK, thanks. It would be good to figure this out for 2.9, since the
workaround of disabling iothreads will affect performance. Let me know
if there's anything I can do to help.

Meanwhile I'm also looking into an intermittent crash running
block-commit on an external snapshot. This is with an earlier QEMU
snapshot (from around 20 Jan):

  /x/qemu/deb/debuild/block.c:2433: bdrv_append: Assertion
`!bdrv_requests_pending(bs_top)' failed.

That assertion no longer exists in the current master, but I'm trying
to reproduce it reliably and see whether the bug itself has
disappeared.

--Ed

Reply via email to