On 2015-03-02 at 04:25, Kevin Wolf wrote:
Am 27.02.2015 um 17:43 hat Max Reitz geschrieben:
Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
which can lead to data corruption (see the iotest added in the final
patch of this series) and is most certainly very ugly.

This series reworks bdrv_close_all() to instead eject the BDS trees from
all BlockBackends and then close the monitor-owned BDS trees, which are
the only BDSs without a BB. In effect, all BDSs are closed just by
getting closed automatically due to their reference count becoming 0.

The benefit over the approach taken in v1 and v2 is that in device
models we often cannot simply drop the reference to a BB because there
may be some user which we forgot about. By ejecting the BDS trees from
the BB, the BB itself becomes unusable, but in a clean way (it will
return errors when accessed, but nothing will crash). Also, it is much
simpler (no reference tracking necessary).

The only disadvantage (I can see) is that the BBs are leaked; but this
does not matter because the qemu process is about to exit anyway.
I haven't looked at the actual patches yet, but just from this
description and the diffstat: We need to make sure that the refcount
really drops to 0.

If you mean the BBs: Tried that in v2, doesn't seem to work. I had my fun tracing around where the handles to the BBs for device models stay and who might have access to them, but when I say "fun" I mean that ironically.

If you mean the BDSs: See below (below below).

That is, if there are NBD servers, block jobs, etc.
that hold an additional reference, we must make sure to stop them. It
doesn't look like this series takes care of this, does it?

Everyone using BBs is fine; for instance, NBD servers are stopped (they register a notifier for when the BDS tree is ejected from a BB).

So block jobs are not handled, once because I still didn't get around to make them work on BBs instead of BDSs (and I probably never will, although they really should), and because I missed the bdrv_ref() in block_job_create()

I guess that means manually cancelling all block jobs in bdrv_close_all()...

Hm... Perhaps we could even install an atexit handler that asserts that
all BDSes are really gone in the end?

I had that for BBs in v1 and v2, it was just a function that was called at the end of bdrv_close_all(). For that to work for BDSs, however, we'd need a list of all BDSs and assert that it's empty. I found that a bit too much.

A simpler but uglier way would be a counter that's incremented every time a BDS is created and decremented every time one is deleted (or incremented once per refcount increment and drecrement once per refcount decrement). We'd then assert that this counter is 0.

Max

Reply via email to