> From: Kevin Wolf [mailto:kw...@redhat.com]
> > > Coroutines aren't randomly assigned to threads, but threads actively
> > > enter coroutines. To my knowledge this happens only when starting a
> > > request (either vcpu or I/O thread; consistent per device) or by a
> > > callback when some event happens (only I/O thread). I can't see any
> > > non-determinism here.
> >
> > The behavior of coroutines looks strange to me.
> > Consider the code below (the co_readv function of the replay driver).
> > In record mode it somehow changes the thread it is assigned to.
> > The code at point A is executed in the CPU thread, and the code at point B
> > in some other thread.
> > Can this happen because the coroutine yields somewhere and its execution
> > is resumed by aio_poll, which is called from the iothread?
> > In that case the event-finishing callback cannot be executed
> > deterministically (always in the CPU thread or always in the I/O thread).
> >
> > static int coroutine_fn blkreplay_co_readv(BlockDriverState *bs,
> >     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
> > {
> >     BDRVBlkreplayState *s = bs->opaque;
> >     uint32_t reqid = request_id++;
> >     Request *req;
> > // A
> >     bdrv_co_readv(bs->file->bs, sector_num, nb_sectors, qiov);
> >
> >     if (replay_mode == REPLAY_MODE_RECORD) {
> >         replay_save_block_event(reqid);
> >     } else {
> >         assert(replay_mode == REPLAY_MODE_PLAY);
> >         if (reqid == current_request) {
> >             current_finished = true;
> >         } else {
> >             req = block_request_insert(reqid, bs, qemu_coroutine_self());
> >             qemu_coroutine_yield();
> >             block_request_remove(req);
> >         }
> >     }
> > // B
> >     return 0;
> > }
> 
> Yes, I guess this can happen. As I described above, the coroutine can be
> entered from a vcpu thread initially. After yielding for the first time,
> it is resumed from the I/O thread. So if there are paths where the
> coroutine never yields, the coroutine completes in the original vcpu
> thread. (It's not the common case that bdrv_co_readv() doesn't yield,
> but it happens e.g. with unallocated sectors in qcow2.)
> 
> If this is a problem for you, you need to force the coroutine into the
> I/O thread. You can do that by scheduling a BH, then yield, and then let
> the BH reenter the coroutine.

Thanks, this approach seems to work. I got rid of replay_run_block_event,
because the BH basically does the same job.
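
For readers following the thread, here is a minimal toy model of the
"schedule a BH, yield, let the BH reenter the coroutine" pattern. It is
not QEMU code and not the actual patch: threads are only labels, the
callback queue has a single slot, and the coroutine is built on
<ucontext.h> (assumes Linux/glibc). All names in it (run_request,
iothread_poll, reenter_cb and so on) are made up for illustration. In QEMU
itself the BH would be created with aio_bh_new() and scheduled with
qemu_bh_schedule().

```c
/* Toy model of bouncing a coroutine through a BH. NOT QEMU code:
 * "threads" are labels and the callback queue has exactly one slot. */
#include <stddef.h>
#include <ucontext.h>

enum thread_id { VCPU, IOTHREAD };

static enum thread_id current_thread;   /* which "thread" is executing now */
static ucontext_t caller_ctx, co_ctx;
static char co_stack[64 * 1024];
static void (*pending_cb)(void);        /* drained only by the I/O thread */
static int inner_yields, bounce;
static enum thread_id point_b;          /* thread seen at point B */

static void co_enter(void) { swapcontext(&caller_ctx, &co_ctx); }
static void co_yield(void) { swapcontext(&co_ctx, &caller_ctx); }

/* Stands in for both the I/O completion callback and the BH. */
static void reenter_cb(void) { co_enter(); }

static void co_readv(void)
{
    /* Point A: runs in whichever thread entered the coroutine first. */
    if (inner_yields) {            /* real I/O: completion comes from the I/O thread */
        pending_cb = reenter_cb;
        co_yield();
    }
    if (bounce) {
        pending_cb = reenter_cb;   /* "schedule a BH" */
        co_yield();                /* the BH reenters us in the I/O thread */
    }
    point_b = current_thread;      /* Point B */
}

/* The I/O thread runs every pending callback until none is left. */
static void iothread_poll(void)
{
    current_thread = IOTHREAD;
    while (pending_cb) {
        void (*cb)(void) = pending_cb;
        pending_cb = NULL;
        cb();
    }
}

/* Start one request from the vcpu thread; report where point B ran. */
enum thread_id run_request(int yields, int use_bh)
{
    inner_yields = yields;
    bounce = use_bh;
    pending_cb = NULL;
    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof(co_stack);
    co_ctx.uc_link = &caller_ctx;  /* return to the last caller when done */
    makecontext(&co_ctx, co_readv, 0);
    current_thread = VCPU;
    co_enter();
    iothread_poll();
    return point_b;
}
```

Without the bounce, run_request(0, 0) reaches point B in the vcpu thread
while run_request(1, 0) reaches it in the I/O thread, which is the
behavior described above. With the bounce enabled, both cases reach
point B in the I/O thread.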

There is one problem with the flush event: the flush callbacks are called
for every layer, and I haven't been able to synchronize them correctly yet.
I'll probably have to add a new callback to the block driver that handles
the flush request for the whole stack of drivers.

Pavel Dovgalyuk

