Hi Muchun,

Muchun Song <[email protected]> wrote:
> mm/sparse: Move subsection_map_init() into sparse_init()
>
> This commit moves subsection_map_init() from free_area_init() into
> sparse_init() so that sparse-specific setup stays together instead of being
> split across the generic free_area_init() path.

This patch introduces a new `sparse_init_subsection_map()` that iterates
over all memblock ranges and calls `sparse_init_subsection_map_range()`:

> +void __init sparse_init_subsection_map(void)
> +{
> +    int i, nid;
> +    unsigned long start, end;
> +
> +    for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, &nid)
> +        sparse_init_subsection_map_range(start, end - start);

However, earlier in `sparse_init()`, `memblocks_present()` calls
`memory_present()`, which internally caps PFN ranges at
`max_sparsemem_pfn` via `mminit_validate_memmodel_limits()`. Sections
beyond this cap never have `ms->usage` allocated.

`for_each_mem_pfn_range()` returns the raw, uncapped memblock ranges.
If a range extends beyond `max_sparsemem_pfn`, then inside
`sparse_init_subsection_map_range()`:

    ms = __nr_to_section(nr);
    subsection_mask_set(ms->usage->subsection_map, pfn, pfns);

`ms->usage` is NULL because `sparse_init_early_section()` was never
called for this section, so the `subsection_mask_set()` call dereferences
a NULL pointer.
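
To make the mismatch concrete, here is a rough user-space model (not
kernel code). The `model_*` names are hypothetical, and the constants
assume x86_64 with 4-level paging (PAGE_SHIFT 12, MAX_PHYSMEM_BITS 46);
other configurations will differ. It clamps a range the way the
`memory_present()` path does and reports whether an uncapped walk would
go past what was prepared:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86_64 4-level paging values; illustration only. */
#define MODEL_PAGE_SHIFT       12
#define MODEL_MAX_PHYSMEM_BITS 46
#define MODEL_MAX_PFN \
	(1ULL << (MODEL_MAX_PHYSMEM_BITS - MODEL_PAGE_SHIFT))

struct model_range {
	uint64_t start_pfn;
	uint64_t end_pfn;	/* exclusive */
};

/* Clamp [start_pfn, end_pfn) so that nothing lies above MODEL_MAX_PFN. */
static struct model_range model_cap_range(struct model_range r)
{
	if (r.start_pfn > MODEL_MAX_PFN)
		r.start_pfn = MODEL_MAX_PFN;
	if (r.end_pfn > MODEL_MAX_PFN)
		r.end_pfn = MODEL_MAX_PFN;
	return r;
}

/*
 * True if walking the raw range visits PFNs that the capped walk
 * skipped, i.e. sections that were never set up.
 */
static bool model_walk_hits_unprepared(struct model_range raw)
{
	struct model_range capped = model_cap_range(raw);

	return raw.end_pfn > capped.end_pfn;
}
```

In this model, a range that lies entirely above the cap collapses to an
empty range after clamping, while the raw walk still covers all of it.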

I was able to reproduce this on x86_64 with 4-level paging by booting
with `memmap=4G@0x400080000000` to place a memblock range beyond the
~64 TiB `max_sparsemem_pfn` limit.  The kernel crashes during early boot:

  node  -1: [mem 0x0000400080000000-0x000040017fffffff]
  ------------[ cut here ]------------
  WARNING: mm/sparse.c:142 at sparse_init+0x1ac/0x8a0
   ...
  PANIC: early exception 0x0d IP 10:...sparse_init_subsection_map+0x12f/0x250
  RIP: 0010:sparse_init_subsection_map+0x12f/0x250
  Call Trace:
   sparse_init+0x69f/0x8a0
   mm_core_init_early+0x12fa/0x20c0
   start_kernel+0x89/0x4e0
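
As a sanity check on the address used above, the arithmetic can be
sketched in user space. The `ex_*` helpers are hypothetical, and the
shifts assume x86_64 with 4-level paging (PAGE_SHIFT 12,
SECTION_SIZE_BITS 27, MAX_PHYSMEM_BITS 46):

```c
#include <stdint.h>

/* Assumed x86_64 4-level paging values; illustration only. */
#define EX_PAGE_SHIFT        12
#define EX_SECTION_SIZE_BITS 27
#define EX_MAX_PHYSMEM_BITS  46

/* Physical address to page frame number. */
static uint64_t ex_phys_to_pfn(uint64_t phys)
{
	return phys >> EX_PAGE_SHIFT;
}

/* Page frame number to memory section number. */
static uint64_t ex_pfn_to_section_nr(uint64_t pfn)
{
	return pfn >> (EX_SECTION_SIZE_BITS - EX_PAGE_SHIFT);
}

/* First PFN that lies beyond the sparsemem limit. */
static uint64_t ex_max_sparsemem_pfn(void)
{
	return 1ULL << (EX_MAX_PHYSMEM_BITS - EX_PAGE_SHIFT);
}

/* Number of sections that fit below the limit. */
static uint64_t ex_nr_mem_sections(void)
{
	return 1ULL << (EX_MAX_PHYSMEM_BITS - EX_SECTION_SIZE_BITS);
}
```

Under these assumptions, 0x400080000000 maps to PFN 0x400080000, above
the limit of 0x400000000, and to section number 0x80010, above the
0x80000 sections available below the limit.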

The fix is a small NULL check in `sparse_init_subsection_map_range()`:

--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -608,6 +608,8 @@ void __init sparse_init_subsection_map_range(unsigned long pfn,
         pfns = min(nr_pages, PAGES_PER_SECTION
                 - (pfn & ~PAGE_SECTION_MASK));
         ms = __nr_to_section(nr);
+        if (!ms->usage)
+            continue;
         subsection_mask_set(ms->usage->subsection_map, pfn, pfns);

On most systems `max_sparsemem_pfn` is large enough that this is never
hit, but on 32-bit or PAE configurations where the limit is much lower,
the mismatch between `for_each_mem_pfn_range()` and
`mminit_validate_memmodel_limits()` can trigger with reasonable memory
sizes.

Thanks,
Xiao


