On Thu, Aug 13, 2015 at 10:41:50AM +0200, Kevin Wolf wrote: > This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero > write on target if sectors not allocated"), which was reported to cause > aborts with the message "Co-routine re-entered recursively". > > The cause for this bug is the following code in mirror_iteration_done(): > > if (s->common.busy) { > qemu_coroutine_enter(s->common.co, NULL); > } > > This has always been ugly because - unlike most places that reenter - it > doesn't have a specific yield that it pairs with, but is more > uncontrolled. What we really mean here is "reenter the coroutine if > it's in one of the four explicit yields in mirror.c". > > This used to be equivalent with s->common.busy because neither > mirror_run() nor mirror_iteration() call any function that could yield. > However since commit dcfb3beb this doesn't hold true any more: > bdrv_get_block_status_above() can yield. > > So what happens is that bdrv_get_block_status_above() wants to take a > lock that is already held, so it adds itself to the queue of waiting > coroutines and yields. Instead of being woken up by the unlock function, > however, it gets woken up by mirror_iteration_done(), which is obviously > wrong. > > In most cases the code actually happens to cope fairly well with such > cases, but in this specific case, the unlock must already have scheduled > the coroutine for wakeup when mirror_iteration_done() reentered it. And > then the coroutine happened to process the scheduled restarts and tried > to reenter itself recursively. > > This patch fixes the problem by pairing the reenter in > mirror_iteration_done() with specific yields instead of abusing > s->common.busy. > > Cc: qemu-sta...@nongnu.org > Signed-off-by: Kevin Wolf <kw...@redhat.com>
Thanks, applied to my block branch: https://github.com/codyprime/qemu-kvm-jtc/tree/block -Jeff