replay

Pavel Dovgalyuk Thu, 25 Feb 2016 01:08:24 -0800

> From: Kevin Wolf [mailto:kw...@redhat.com]
> > > Coroutines aren't randomly assigned to threads, but threads actively
> > > enter coroutines. To my knowledge this happens only when starting a
> > > request (either vcpu or I/O thread; consistent per device) or by a
> > > callback when some event happens (only I/O thread). I can't see any
> > > non-determinism here.
> >
> > The behavior of coroutines looks strange to me.
> > Consider the code below (the co_readv function of the replay driver).
> > In record mode it somehow changes the thread it is assigned to:
> > the code at point A is executed in the CPU thread and the code at
> > point B in some other thread.
> > Could this happen because the coroutine yields somewhere and its
> > execution is resumed by aio_poll, which is called from the iothread?
> > In that case the event-finishing callback cannot be executed
> > deterministically (always in the CPU thread or always in the I/O thread).
> >
> > static int coroutine_fn blkreplay_co_readv(BlockDriverState *bs,
> >     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
> > {
> >     BDRVBlkreplayState *s = bs->opaque;
> >     uint32_t reqid = request_id++;
> >     Request *req;
> > // A
> >     bdrv_co_readv(bs->file->bs, sector_num, nb_sectors, qiov);
> >
> >     if (replay_mode == REPLAY_MODE_RECORD) {
> >         replay_save_block_event(reqid);
> >     } else {
> >         assert(replay_mode == REPLAY_MODE_PLAY);
> >         if (reqid == current_request) {
> >             current_finished = true;
> >         } else {
> >             req = block_request_insert(reqid, bs, qemu_coroutine_self());
> >             qemu_coroutine_yield();
> >             block_request_remove(req);
> >         }
> >     }
> > // B
> >     return 0;
> > }
> 
> Yes, I guess this can happen. As I described above, the coroutine can be
> entered from a vcpu thread initially. After yielding for the first time,
> it is resumed from the I/O thread. So if there are paths where the
> coroutine never yields, the coroutine completes in the original vcpu
> thread. (It's not the common case that bdrv_co_readv() doesn't yield,
> but it happens e.g. with unallocated sectors in qcow2.)
> 
> If this is a problem for you, you need to force the coroutine into the
> I/O thread. You can do that by scheduling a BH, then yield, and then let
> the BH reenter the coroutine.


Thanks, this approach seems to work. I got rid of replay_run_block_event,
because the BH basically does the same job.
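
To make the thread hand-off concrete, here is a small standalone simulation
of the pattern Kevin described (schedule a BH, yield, let the BH reenter the
coroutine). It is only a sketch and uses no QEMU code: the coroutine is built
on POSIX ucontext, the "threads" are just a label (CTX_VCPU / CTX_IO), and
all names (run_request, io_work, reenter_cb, ...) are hypothetical. It
assumes a platform that provides <ucontext.h>, such as Linux with glibc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <ucontext.h>

/* Hypothetical stand-ins for QEMU's threads: we only track a label. */
typedef enum { CTX_VCPU, CTX_IO } Ctx;

static Ctx current_ctx;              /* which "thread" is running right now */
static ucontext_t caller_uc, co_uc;  /* saved contexts: caller and coroutine */
static char co_stack[64 * 1024];     /* stack used by the coroutine */
static void (*io_work)(void);        /* one-slot queue drained by the I/O loop */
static bool sim_read_yields, sim_use_bh, co_done;
static Ctx ctx_at_b;                 /* context observed at point B */

static void co_yield(void) { swapcontext(&co_uc, &caller_uc); }
static void co_enter(void) { swapcontext(&caller_uc, &co_uc); }

/* Callback run by the I/O loop; it simply reenters the coroutine. */
static void reenter_cb(void) { co_enter(); }

static void request_co(void)
{
    /* Point A: runs in whichever context entered the coroutine first. */
    if (sim_read_yields) {
        io_work = reenter_cb;        /* simulated I/O completion */
        co_yield();
    }
    if (sim_use_bh) {
        io_work = reenter_cb;        /* "schedule a BH" for the I/O loop... */
        co_yield();                  /* ...and yield until it reenters us */
    }
    /* Point B */
    ctx_at_b = current_ctx;
    co_done = true;
}   /* returning resumes caller_uc through uc_link */

/* Runs one simulated request and reports where point B executed. */
Ctx run_request(bool read_yields, bool use_bh)
{
    sim_read_yields = read_yields;
    sim_use_bh = use_bh;
    co_done = false;
    io_work = NULL;

    getcontext(&co_uc);
    co_uc.uc_stack.ss_sp = co_stack;
    co_uc.uc_stack.ss_size = sizeof(co_stack);
    co_uc.uc_link = &caller_uc;
    makecontext(&co_uc, request_co, 0);

    current_ctx = CTX_VCPU;          /* the vcpu thread starts the request */
    co_enter();

    current_ctx = CTX_IO;            /* the I/O loop drains queued work */
    while (io_work) {
        void (*cb)(void) = io_work;
        io_work = NULL;
        cb();
    }
    assert(co_done);
    return ctx_at_b;
}
```

Without the BH step, run_request(false, false) reports CTX_VCPU while
run_request(true, false) reports CTX_IO, which is the non-determinism
discussed above. With use_bh set, point B is reached in CTX_IO in both cases.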

One problem remains with the flush event: flush callbacks are called for
every layer, and I have not managed to synchronize them correctly yet.
I'll probably have to add a new callback to the block driver that handles
a flush request for the whole stack of drivers.

Pavel Dovgalyuk

Re: [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay
