On 16.07.19 16:40, Kevin Wolf wrote: > Am 19.06.2019 um 17:25 hat Max Reitz geschrieben: >> Hi, >> >> This is v2 to “block: Keep track of parent quiescing”. >> >> Please read this cover letter, because I’m very unsure about the design >> in this series and I’d appreciate some comments. >> >> As Kevin wrote in his reply to that series, the actual problem is that >> bdrv_drain_invoke() polls on every node whenever ending a drain. This >> may cause graph changes, which is actually forbidden. >> >> To solve that problem, this series makes the drain code construct a list >> of undrain operations that have been initiated, and then polls all of >> them on the root level once graph changes are acceptable. >> >> Note that I don’t like this list concept very much, so I’m open to >> alternatives. > > So drain_end is different from drain_begin in that it wants to wait only > for all bdrv_drain_invoke() calls to complete, but not for other > requests that are in flight. Makes sense. > > Though instead of managing a whole list, wouldn't a counter suffice? > >> Furthermore, all BdrvChildRoles with BDS parents have a broken >> .drained_end() implementation. The documentation clearly states that >> this function is not allowed to poll, but it does. So this series >> changes it to a variant (using the new code) that does not poll. >> >> There is a catch, which may actually be a problem, I don’t know: The new >> variant of that .drained_end() does not poll at all, never. As >> described above, now every bdrv_drain_invoke() returns an object that >> describes when it will be done and which can thus be polled for. These >> objects are just discarded when using BdrvChildRole.drained_end(). That >> does not feel quite right. It would probably be more correct to let >> BdrvChildRole.drained_end() return these objects so the top level >> bdrv_drained_end() can poll for their completion. >> >> I decided not to do this, for two reasons: >> (1) Doing so would spill the “list of objects to poll for” design to >> places outside of block/io.c. I don’t like the design very much as >> it is, but I can live with it as long as it’s constrained to the >> core drain code in block/io.c. >> This is made worse by the fact that currently, those objects are of >> type BdrvCoDrainData. But it shouldn’t be a problem to add a new >> type that is externally visible (we only need the AioContext and >> whether bdrv_drain_invoke_entry() is done). >> >> (2) It seems to work as it is. >> >> The alternative would be to add the same GSList ** parameter to >> BdrvChildRole.drained_end() that I added in the core drain code in patch >> 2, and then let the .drained_end() implementation fill that with objects >> to poll for. (Which would be accomplished by adding a frontend to >> bdrv_do_drained_end() that lets bdrv_child_cb_drained_poll() pass the >> parameter through.) >> >> Opinions? > > I think I would add an int* to BdrvChildRole.drained_end() so that we > can just increase the counter whereever we need to.
So you mean just polling the @bs for which a caller gave poll=true until the counter reaches 0? I’ll try, sounds good (if I can get it to work). >> And then we have bdrv_replace_child_noperm(), which actually wants a >> polling BdrvChildRole.drained_end(). So this series adds >> BdrvChildRole.drained_end_unquiesce(), which takes that role (if there >> is any polling to do). >> >> Note that if I implemented the alternative described above >> (.drained_end() gets said GSList ** parameter), a >> .drained_end_unquiesce() wouldn’t be necessary. >> bdrv_parent_drained_end_single() could just poll the list returned by >> .drained_end() by itself. > > The split between .drained_end/.drained_end_unquiesce feels wrong. It > shouldn't be the job of the BdrvChildRole to worry about this. Polling > should be handled inside bdrv_parent_drained_end_single(), like we do in > bdrv_parent_drained_begin_single(), so that the BdrvChildRole never has > to poll. If it’s just an int pointer, there’s no point in having two functions, I suppose. Thanks for you suggestion! >> Finally, patches 1, 8, and 9 are unmodified from v1. >> [...] >> >> include/block/block.h | 22 +++++- >> include/block/block_int.h | 23 ++++++ >> block.c | 24 +++--- >> block/io.c | 155 ++++++++++++++++++++++++++++++------- >> python/qemu/__init__.py | 5 +- >> tests/qemu-iotests/040 | 40 +++++++++- >> tests/qemu-iotests/040.out | 4 +- >> tests/qemu-iotests/255 | 2 +- >> 8 files changed, 231 insertions(+), 44 deletions(-) > > I feel this series should add something to tests/test-bdrv-drain.c, too. > qemu-iotests can only test high-level block job commands that happen to > trigger the bug today, but that code may change in the future. Unit > tests allow us to test the problematic cases more directly. Well, I’m glad if test-bdrv-drain just keeps working. :-) I’ll try. Max
signature.asc
Description: OpenPGP digital signature