On 01.07.2025 at 19:16, Kevin Wolf wrote:
> On 30.05.2025 at 17:10, Fiona Ebner wrote:
> > This series is an attempt to fix a deadlock issue reported by Andrey
> > here [3].
> > 
> > bdrv_drained_begin() polls and is not allowed to be called with the
> > block graph lock held. Mark the function as GRAPH_UNLOCKED.
> > 
> > This alone does not catch the issue reported by Andrey, because there
> > is a bdrv_graph_rdunlock_main_loop() before bdrv_drained_begin() in
> > the function bdrv_change_aio_context(). That unlock is of course
> > ineffective if the exclusive lock is held, but it prevents TSA from
> > finding the issue.
> > 
> > Thus the bdrv_drained_begin() call from inside
> > bdrv_change_aio_context() needs to be moved up the call stack before
> > acquiring the locks. This is the bulk of the series.
> > 
> > Granular draining is not trivially possible, because many of the
> > affected functions can recursively call themselves.
> > 
> > In places where bdrv_drained_begin() calls were removed, assertions
> > are added, checking the quiesce_counter to ensure that the nodes
> > already got drained further up in the call stack.
> 
> I finished review for this series. I had some minor comments on patches
> 24, 27 and 41. Once we agree what to do there, I can probably just make
> any changes myself while applying.

I don't see any objections, so I just applied this and made all the
changes I had suggested.

Thanks, applied to the block branch.

Kevin

