On 12.05.2016 at 15:47, Alberto Garcia wrote:
> On Tue 03 May 2016 03:48:47 PM CEST, Kevin Wolf wrote:
> > On 03.05.2016 at 15:33, Alberto Garcia wrote:
> >> On Tue 03 May 2016 03:23:24 PM CEST, Kevin Wolf wrote:
> >> >> c) we fix bdrv_reopen() so we can actually run both jobs at the same
> >> >>    time. I'm wondering if pausing all block jobs between
> >> >>    bdrv_reopen_prepare() and bdrv_reopen_commit() would do the
> >> >>    trick. Opinions?
> >> >
> >> > I would have to read up the details of the problem again, but I think
> >> > with bdrv_drained_begin/end() we actually have the right tool now to
> >> > fix it properly. We may need to pull up the drain (bdrv_drain_all()
> >> > today) from bdrv_reopen_multiple() to its caller and just assert it in
> >> > the function itself, but there shouldn't be much more to it than that.
> >>
> >> I think that's not enough, see point 2) here:
> >>
> >> https://lists.gnu.org/archive/html/qemu-block/2015-12/msg00180.html
> >>
> >> "I've been taking a look at the bdrv_drained_begin/end() API, but as I
> >> understand it, it only prevents requests from a different AioContext.
> >> Since all BDS in the same chain share the same context, it does not
> >> really help here."
> >
> > Yes, that's the part I meant with pulling up the calls.
> >
> > If I understand correctly, the problem is that first bdrv_reopen_queue()
> > queues a few BDSes for reopen, then bdrv_drain_all() completes all
> > running requests and can indirectly trigger a graph modification, and
> > then bdrv_reopen_multiple() uses the queue, which doesn't match reality
> > any more.
> >
> > The solution to that should be simply changing the order of things:
> >
> > 1. bdrv_drained_begin()
> > 2. bdrv_reopen_queue()
> > 3. bdrv_reopen_multiple()
> >    * Instead of bdrv_drain_all(), assert that no requests are pending
> >    * We don't run requests, so we can't complete a block job and
> >      manipulate the graph any more
> > 4. bdrv_drained_end()
>
> This doesn't work. Here's what happens:
>
> 1) Block job (a) starts (block-stream).
>
> 2) Block job (b) starts (block-stream, or block-commit).
>
> 3) Job (b) calls bdrv_reopen() and does the drain call.
>
> 4) Job (b) creates reopen_queue and calls bdrv_reopen_multiple().
>    There are no pending requests at this point, but job (a) is sleeping.
>
> 5) bdrv_reopen_multiple() iterates over reopen_queue and calls
>    bdrv_reopen_prepare() -> bdrv_flush() -> bdrv_co_flush() ->
>    qemu_coroutine_yield().
I think it's what happens between here and the next step that I don't
understand. bdrv_reopen_multiple() is not called in coroutine context,
right? All block jobs use block_job_defer_to_main_loop() before they
call bdrv_reopen(), as far as I can see. So bdrv_flush() shouldn't take
the coroutine shortcut, but use a nested event loop.

What is it that calls into job (a) from that event loop? It can't be a
request completion, because we already drained all requests. Is it a
timer?

> 6) Job (a) resumes, finishes the job and removes nodes from the graph.
>
> 7) Job (b) continues with bdrv_reopen_multiple(), but now reopen_queue
>    contains invalid pointers.

I don't fully understand the problem yet, but as a shot in the dark,
would pausing block jobs in bdrv_drained_begin() help?

Kevin
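[Editorial sketch: the reordering proposed earlier in this message (drain first, then build the reopen queue) could be written roughly as below. This is a non-compilable illustration, not actual QEMU code: the wrapper name bdrv_reopen_drained() is invented, and the bdrv_* signatures are simplified from the block layer API of the time.]

```c
/* Hypothetical wrapper -- names and signatures are approximate. */
static int bdrv_reopen_drained(BlockDriverState *bs, int flags, Error **errp)
{
    BlockReopenQueue *queue;
    int ret;

    /* 1. Quiesce first: no request can complete from here on, so no
     *    completion can indirectly finish a block job and modify the
     *    graph behind our back. */
    bdrv_drained_begin(bs);

    /* 2. Build the reopen queue against the now-stable graph. */
    queue = bdrv_reopen_queue(NULL, bs, NULL, flags);

    /* 3. Reopen. Instead of calling bdrv_drain_all() internally, this
     *    would merely assert that no requests are pending. */
    ret = bdrv_reopen_multiple(queue, errp);

    /* 4. Resume request processing. */
    bdrv_drained_end(bs);

    return ret;
}
```

[As the rest of the thread shows, draining alone does not cover a job that is merely sleeping rather than waiting on I/O, which is what motivates the closing question about pausing jobs in bdrv_drained_begin().]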