On Wed, Jun 05, 2019 at 02:41:08PM +0800, Peter Xu wrote:
>On Wed, Jun 05, 2019 at 09:08:28AM +0800, Wei Yang wrote:
>> In case we gets a queued page, the order of block is interrupted. We may
>> not rely on the complete_round flag to say we have already searched the
>> whole blocks on the list.
>>
>> Signed-off-by: Wei Yang <richardw.y...@linux.intel.com>
>> ---
>>  migration/ram.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index d881981876..e9b40d636d 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -2290,6 +2290,12 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
>>           */
>>          pss->block = block;
>>          pss->page = offset >> TARGET_PAGE_BITS;
>> +
>> +        /*
>> +         * This unqueued page would break the "one round" check, even is
>> +         * really rare.
>
>Why this is needed?  Could you help explain the problem first?

Peter,

Thanks for your question. I found this issue during code review, and I
believe it is a corner case.

Below is a rough outline of ram_find_and_save_block():

    ram_find_and_save_block
        do
            get_queued_page()
            find_dirty_block()
            ram_save_host_page()
        while

The basic logic here is: get a page that needs to be migrated, and migrate
it.

Without get_queued_page(), find_dirty_block() searches the whole
ram_list.blocks in order, and pss->complete_round indicates whether this
search has wrapped around.

Everything works fine until get_queued_page() gets involved. The block
unqueued in get_queued_page() could be any block in ram_list.blocks, which
means there is a small chance of breaking the looped indicator.

                       unqueue_page()      last_seen_block
                                     |     |
    ram_list.blocks                  v     v
    ---------------------------------+=====+---

This is just a rough picture to demonstrate the corner case. For example,
we start from last_seen_block and search to the end of ram_list.blocks. At
this moment pss->complete_round is set to true. Then we get a queued page
from unqueue_page() at the point marked above. The loop then continues from
there and may cover only the range marked with "=". All the other ranges
are skipped.

This really is a corner case, since it requires ram_save_host_page() to
return 0 and there should be no dirty page in this range. But I don't see
how we can rule it out.

If I am not correct, just let me know :-)

>
>Thanks,
>
>--
>Peter Xu

-- 
Wei Yang
Help you, Help me
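
P.S. To make the corner case easier to follow, here is a minimal standalone
sketch of the loop. It is a toy model, not the code in migration/ram.c:
every block holds a single page, and toy_pss, toy_find_block(), queue_step
and queue_block are hypothetical names made up for this illustration. It
also assumes that the queued page is picked up but nothing gets saved
(ram_save_host_page() returning 0), so the loop simply continues from the
unqueued block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 8

/* Toy stand-in for PageSearchStatus, with one page per block. */
struct toy_pss {
    size_t block;         /* index of the block being examined */
    bool complete_round;  /* set once the search wraps past the end */
};

/*
 * Toy model of the do/while loop in ram_find_and_save_block().
 *
 * dirty[i]    : whether block i holds a dirty page
 * last_seen   : where the search starts (plays rs->last_seen_block)
 * queue_step  : loop iteration at which a queued page shows up, -1 for never
 * queue_block : block the queued page belongs to
 * visited[i]  : set to true for every block the dirty search examined
 *
 * Returns the index of the dirty block found, or -1 if the search gave up.
 */
int toy_find_block(const bool dirty[NBLOCKS], size_t last_seen,
                   int queue_step, size_t queue_block,
                   bool visited[NBLOCKS])
{
    struct toy_pss pss = { .block = last_seen, .complete_round = false };

    assert(last_seen < NBLOCKS && queue_block < NBLOCKS);

    for (int step = 0; ; step++) {
        if (step == queue_step) {
            /*
             * Assumption: the queued page moves the search position, but
             * nothing is saved, so the loop goes on from the new block.
             * complete_round is left untouched, which is the problem.
             */
            pss.block = queue_block;
            continue;
        }

        /* "One round" check: looped once and back at the start, give up. */
        if (pss.complete_round && pss.block == last_seen) {
            return -1;
        }

        visited[pss.block] = true;
        if (dirty[pss.block]) {
            return (int)pss.block;
        }

        /* Nothing dirty here, move on and flag a wrap at the list end. */
        pss.block++;
        if (pss.block == NBLOCKS) {
            pss.block = 0;
            pss.complete_round = true;
        }
    }
}
```

With only block 2 dirty and last_seen = 6, calling toy_find_block() with
queue_step = -1 finds block 2. With queue_step = 2 and queue_block = 4, the
search has already wrapped when the queued page arrives, so it only looks at
blocks 4 and 5 before hitting last_seen and returns -1 without ever visiting
blocks 0 to 3.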