On Thu, Aug 03, 2017 at 12:56:31PM +0100, Dr. David Alan Gilbert wrote: [...]
> > @@ -256,6 +257,8 @@ struct RAMState { > > RAMBlock *last_req_rb; > > /* Queue of outstanding page requests from the destination */ > > QemuMutex src_page_req_mutex; > > + /* Ramblock counts to sync dirty bitmap. Only used for recovery */ > > + int ramblock_to_sync; > > QSIMPLEQ_HEAD(src_page_requests, RAMSrcPageRequest) src_page_requests; > > }; > > typedef struct RAMState RAMState; > > @@ -2731,6 +2734,57 @@ static int ram_load(QEMUFile *f, void *opaque, int > > version_id) > > return ret; > > } > > > > +/* Sync all the dirty bitmap with destination VM. */ > > +static int ram_dirty_bitmap_sync_all(MigrationState *s, RAMState *rs) > > +{ > > + RAMBlock *block; > > + QEMUFile *file = s->to_dst_file; > > + int ramblock_count = 0; > > + > > + trace_ram_dirty_bitmap_sync("start"); > > Most (but not all) of our trace_ uses have separate trace_ entries for > each step; e.g. trace_ram_dirty_bitmap_sync_start9) Okay. > > > + /* > > + * We need to take the resume lock to make sure that the send > > + * thread (current thread) and the rp-thread will do their work in > > + * order. > > + */ > > + qemu_mutex_lock(&s->resume_lock); > > + > > + /* Request for receive-bitmap for each block */ > > + RAMBLOCK_FOREACH(block) { > > + ramblock_count++; > > + qemu_savevm_send_recv_bitmap(file, block->idstr); > > + } > > + > > + /* Init the ramblock count to total */ > > + atomic_set(&rs->ramblock_to_sync, ramblock_count); > > + > > + trace_ram_dirty_bitmap_sync("wait-bitmap"); > > + > > + /* Wait until all the ramblocks' dirty bitmap synced */ > > + while (rs->ramblock_to_sync) { > > + qemu_cond_wait(&s->resume_cond, &s->resume_lock); > > + } > > Does the locking here get simpler if you: > a) count the number of RAMBlocks 'n' > b) Initialise a sempahore to -(n-1) > c) Call qemu_savevm_send_recv_bitmap for each bitmap > d) sem_wait on the semaphore - which is waiting for the semaphore to > be >0 > > as you receive each bitmap do a sem_post; on the last one > it should go from 0->1 and the sem_wait should wake up? I think you are right. :-) A single semaphore suffice here (and also for the following up handshake). I will touch up the commit message as well. Thanks, -- Peter Xu