On Wed, Jun 05, 2019 at 04:52:07PM +0800, Wei Yang wrote:
> On Wed, Jun 05, 2019 at 02:41:08PM +0800, Peter Xu wrote:
> >On Wed, Jun 05, 2019 at 09:08:28AM +0800, Wei Yang wrote:
> >> In case we gets a queued page, the order of block is interrupted. We may
> >> not rely on the complete_round flag to say we have already searched the
> >> whole blocks on the list.
> >>
> >> Signed-off-by: Wei Yang <richardw.y...@linux.intel.com>
> >> ---
> >>  migration/ram.c | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> >>
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index d881981876..e9b40d636d 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs,
> >> PageSearchStatus *pss)
> >>           */
> >>          pss->block = block;
> >>          pss->page = offset >> TARGET_PAGE_BITS;
> >> +
> >> +        /*
> >> +         * This unqueued page would break the "one round" check, even is
> >> +         * really rare.
> >
> >Why this is needed?  Could you help explain the problem first?
>
> Peter, Thanks for your question.
>
> I found this issue during code review and I believe this is a corner case.
>
> Below is a draft chart for ram_find_and_save_block:
>
>     ram_find_and_save_block
>         do
>             get_queued_page()
>             find_dirty_block()
>             ram_save_host_page()
>         while
>
> The basic logic here is : get a page need to migrate and migrate it.
>
> In case we don't have get_queued_page(), find_dirty_block() will search the
> whole ram_list.blocks by order. pss->complete_round is used to indicate
> whether this search has looped.
>
> Everything works fine after get_queued_page() involved. The block unqueued in
> get_queued_page() could be any block in the ram_list.blocks. This means we
> have very little chance to break the looped indicator.
>
>                     unqueue_page()  last_seen_block
>                                  |     |
> ram_list.blocks                  v     v
> ---------------------------------+=====+---
>
>
> Just draw a raw picture to demonstrate a corner case.
>
> For example, we start from last_seen_block and search till the end of
> ram_list.blocks. At this moment, pss->complete_round is set to true. Then we
> get a queued page from unqueue_page() at the point I pointed. So the loop
> continues may just continue the range as I marked as "=". We will skip all the
> other ranges.

Ah, I see your point, but I don't think there is a problem.  Note that
complete_round is reset on every call to ram_find_and_save_block(), so
even if that call to ram_find_and_save_block() returns, we'll still know
we have dirty pages to migrate, and the next call will be fine, no?

-- 
Peter Xu
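
The argument above (the round flag is reset on every call, so a range
skipped by one call is picked up by the next) can be illustrated with a
small toy model.  This is only a sketch under stated assumptions, not the
code in migration/ram.c: struct toy_state, toy_find_and_save(), NBLOCKS
and the jump_after_wrap parameter are all hypothetical names, each block
holds a single page, and the out-of-order queued request is simplified to
a one-time cursor jump right after the scan wraps around.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS 8

/* Toy stand-in for the migration state: one dirty flag per block. */
struct toy_state {
    bool dirty[NBLOCKS];
    int last_seen;          /* block where the previous search stopped */
};

/*
 * One search call.  Saves at most one dirty block and returns how many
 * blocks were saved (0 or 1).  jump_after_wrap simulates a queued request
 * that moves the cursor out of order once the scan has wrapped; pass -1
 * for no queued request.
 */
static int toy_find_and_save(struct toy_state *s, int jump_after_wrap)
{
    int pos = s->last_seen;
    bool complete_round = false;    /* reset on every call */

    assert(jump_after_wrap < NBLOCKS);

    for (;;) {
        if (s->dirty[pos]) {
            s->dirty[pos] = false;  /* "migrate" the block */
            s->last_seen = pos;
            return 1;
        }
        pos++;
        if (pos == NBLOCKS) {
            pos = 0;
            complete_round = true;
            if (jump_after_wrap >= 0) {
                pos = jump_after_wrap;  /* order is interrupted here */
                jump_after_wrap = -1;
            }
        }
        /* Back at (or past) the starting point after wrapping: give up. */
        if (complete_round && pos >= s->last_seen) {
            return 0;
        }
    }
}
```

With last_seen = 6, only block 1 dirty, and a jump to block 4 after the
wrap, the first call scans 6, 7, 4, 5 and returns 0 without ever looking
at block 1.  A caller that still sees dirty blocks calls again; the flag
starts out false, the scan covers 6, 7, 0, 1, and block 1 is saved.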