On Wed, Nov 25, 2020 at 07:45:30AM +0100, David Hildenbrand wrote:
> > Something must have changed more recently than v5.1 that caused the
> > zoneid of reserved pages to be wrong, a possible candidate for the
> > real would be this change below:
> > 
> > +               __init_single_page(pfn_to_page(pfn), pfn, 0, 0);
> > 
> 
> Before that change, the memmap of memory holes were only zeroed out. So the 
> zones/nid was 0, however, pages were not reserved and had a refcount of zero 
> - resulting in other issues.
> 
> Most pfn walkers shouldn't mess with reserved pages and simply skip them. 
> That would be the right fix here.
> 

Ordinarily yes, pfn walkers should not care about reserved pages but it's
still surprising that the node/zone linkages would be wrong for memory
holes. If they are in the middle of a zone, it means that a hole with
valid struct pages could be mistaken for overlapping nodes (if the hole
was in node 1 for example) or overlapping zones which is just broken.
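
As a rough illustration only, here is a user-space toy model of that
situation. It is not kernel code: every toy_* name and the pfn layout are
made up for this sketch. The one thing taken from the thread is the
assumption that hole pages get initialised with zone 0 and nid 0, as in
the quoted __init_single_page() call, and are marked reserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: 16 pfns, all hypothetical, nothing here is a kernel API. */
#define TOY_NR_PFNS 16

struct toy_page {
	int zone;       /* zone id recorded in the struct page */
	int nid;        /* node id recorded in the struct page */
	bool reserved;  /* stands in for PageReserved() */
};

static struct toy_page toy_memmap[TOY_NR_PFNS];

/* Stand-in for initialising one struct page with a zone and node. */
static void toy_init_single_page(size_t pfn, int zone, int nid, bool reserved)
{
	assert(pfn < TOY_NR_PFNS);
	toy_memmap[pfn].zone = zone;
	toy_memmap[pfn].nid = nid;
	toy_memmap[pfn].reserved = reserved;
}

/*
 * Zone 1 on node 1 spans every pfn, with a hole at pfns 6..9.
 * Assumption: the hole is initialised with zone 0 / nid 0 and reserved.
 */
static void toy_setup(void)
{
	size_t pfn;

	for (pfn = 0; pfn < TOY_NR_PFNS; pfn++)
		toy_init_single_page(pfn, 1, 1, false);
	for (pfn = 6; pfn < 10; pfn++)
		toy_init_single_page(pfn, 0, 0, true);
}

/*
 * Walk [start, end) and count pages whose recorded zone or node does not
 * match what the walker expects. Optionally skip reserved pages first.
 */
static size_t toy_count_mismatches(size_t start, size_t end, int zone,
				   int nid, bool skip_reserved)
{
	size_t pfn, bad = 0;

	for (pfn = start; pfn < end && pfn < TOY_NR_PFNS; pfn++) {
		const struct toy_page *page = &toy_memmap[pfn];

		if (skip_reserved && page->reserved)
			continue;
		if (page->zone != zone || page->nid != nid)
			bad++;
	}
	return bad;
}
```

After toy_setup(), toy_count_mismatches(0, TOY_NR_PFNS, 1, 1, false)
counts the four hole pfns as belonging to another zone and node, which is
what an overlap would look like to a walker. Passing skip_reserved = true
makes the count drop to zero, but the linkage stored in those pages is
still wrong.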

> > 
> > Whenever pfn_valid is true, it's better that the zoneid/nid is correct
> > at all times, otherwise if the second stage fails we end up in a bug with
> > weird side effects.
> 
> Memory holes with a valid memmap might not have a zone/nid. For now, skipping 
> reserved pages should be good enough, no?
> 

It would partially paper over the issue of setting the pageblock type
based on a reserved page. I agree that compaction should not be returning
pfns that are outside of the zone range because that is buggy in itself
but valid struct pages should have valid information. I don't think we
want to paper over that with unnecessary PageReserved checks.

-- 
Mel Gorman
SUSE Labs
