On Sun, Sep 2, 2012 at 10:46 AM, Yinghai Lu <[email protected]> wrote: > From: Jacob Shin <[email protected]> > > Currently direct mappings are created for [ 0 to max_low_pfn<<PAGE_SHIFT ) > and [ 4GB to max_pfn<<PAGE_SHIFT ), which may include regions that are not > backed by actual DRAM. This is fine for holes under 4GB which are covered > by fixed and variable range MTRRs to be UC. However, we run into trouble > on higher memory addresses which cannot be covered by MTRRs. > > Our system with 1TB of RAM has an e820 that looks like this: > > BIOS-e820: [mem 0x0000000000000000-0x00000000000983ff] usable > BIOS-e820: [mem 0x0000000000098400-0x000000000009ffff] reserved > BIOS-e820: [mem 0x00000000000d0000-0x00000000000fffff] reserved > BIOS-e820: [mem 0x0000000000100000-0x00000000c7ebffff] usable > BIOS-e820: [mem 0x00000000c7ec0000-0x00000000c7ed7fff] ACPI data > BIOS-e820: [mem 0x00000000c7ed8000-0x00000000c7ed9fff] ACPI NVS > BIOS-e820: [mem 0x00000000c7eda000-0x00000000c7ffffff] reserved > BIOS-e820: [mem 0x00000000fec00000-0x00000000fec0ffff] reserved > BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved > BIOS-e820: [mem 0x00000000fff00000-0x00000000ffffffff] reserved > BIOS-e820: [mem 0x0000000100000000-0x000000e037ffffff] usable > BIOS-e820: [mem 0x000000e038000000-0x000000fcffffffff] reserved > BIOS-e820: [mem 0x0000010000000000-0x0000011ffeffffff] usable > > and so direct mappings are created for huge memory hole between > 0x000000e038000000 to 0x0000010000000000. Even though the kernel never > generates memory accesses in that region, since the page tables mark > them incorrectly as being WB, our (AMD) processor ends up causing a MCE > while doing some memory bookkeeping/optimizations around that area. > > This patch iterates through e820 and only direct maps ranges that are > marked as E820_RAM, and keeps track of those pfn ranges. Depending on > the alignment of E820 ranges, this may possibly result in using smaller > size (i.e. 4K instead of 2M or 1G) page tables. > > -v2: move changes from setup.c to mm/init.c, also use for_each_mem_pfn_range > instead. - Yinghai Lu > -v3: add calculate_all_table_space_size() to get correct needed page table > size. - Yinghai Lu > > Signed-off-by: Jacob Shin <[email protected]>
Yinghai's sign-off is missing. Reviewed-by: Pekka Enberg <[email protected]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

