> On 13.03.2021 at 05:04, Liang, Liang (Leo) <liang.li...@amd.com> wrote:
> 
> [AMD Public Use]
> 
> Hi David,
> 
> Which benchmark tool do you prefer? Memtest86+ or something else?

Hi Leo,

I think you want something that runs under Linux natively.

I’m planning on coding up a kernel module that walks all 4MB pages in the 
freelists and runs a stream benchmark on each one individually. Then we might be 
able to identify the problematic range - if there is a problematic range :) I 
guess I’ll have it running by Monday and will let you know.
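
Until that module exists, the idea of timing memory chunk by chunk can be approximated from userspace. The following is only a rough sketch under stated assumptions, not the kernel module described above: it cannot walk the buddy freelists or choose which physical pages it gets, the first write to each buffer includes page-fault cost, and interpreter overhead adds noise. The names `time_chunks` and `outliers`, and the factor of 10, are made up for illustration.

```python
import time

CHUNK = 4 * 1024 * 1024  # 4 MiB, matching the page size mentioned above


def time_chunks(count, chunk_size=CHUNK):
    """Allocate `count` buffers and time one write+read pass over each."""
    # chunk_size is assumed to be a multiple of 256 so the pattern fills it exactly.
    pattern = bytes(range(256)) * (chunk_size // 256)
    chunks, timings = [], []
    for _ in range(count):
        buf = bytearray(chunk_size)
        start = time.perf_counter()
        buf[:] = pattern       # write pass over the whole buffer
        sum(buf[::4096])       # read pass: one byte per 4 KiB stride
        timings.append(time.perf_counter() - start)
        chunks.append(buf)     # keep buffers alive so later ones need fresh memory
    return timings


def outliers(timings, factor=10.0):
    """Indices of chunks that took more than `factor` times the median."""
    ordered = sorted(timings)
    median = ordered[len(ordered) // 2]
    return [i for i, t in enumerate(timings) if t > factor * median]


if __name__ == "__main__":
    timings = time_chunks(16)  # 16 x 4 MiB = 64 MiB; raise the count to cover more RAM
    print("slow chunks:", outliers(timings))
```

Raising the chunk count step by step gets closer to the "differing sizes" experiment, but a flagged chunk here only hints at slow memory; it does not identify a physical range.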

Cheers!

> 
> BRs,
> Leo
> -----Original Message-----
> From: David Hildenbrand <da...@redhat.com> 
> Sent: Saturday, March 13, 2021 12:47 AM
> To: Liang, Liang (Leo) <liang.li...@amd.com>; Deucher, Alexander 
> <alexander.deuc...@amd.com>; linux-kernel@vger.kernel.org; amd-gfx list 
> <amd-...@lists.freedesktop.org>; Andrew Morton <a...@linux-foundation.org>
> Cc: Huang, Ray <ray.hu...@amd.com>; Koenig, Christian 
> <christian.koe...@amd.com>; Mike Rapoport <r...@linux.ibm.com>; Rafael J. 
> Wysocki <raf...@kernel.org>; George Kennedy <george.kenn...@oracle.com>
> Subject: Re: slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail 
> in __free_pages_core()")
> 
>> On 12.03.21 17:19, Liang, Liang (Leo) wrote:
>> [AMD Public Use]
>> 
>> Dmesg attached.
>> 
> 
> 
> So, looks like the "real" slowdown starts once the buddy is up and running 
> (no surprise).
> 
> 
> [    0.044035] Memory: 6856724K/7200304K available (14345K kernel code, 9699K 
> rwdata, 5276K rodata, 2628K init, 12104K bss, 343324K reserved, 0K 
> cma-reserved)
> [    0.044045] random: get_random_u64 called from 
> __kmem_cache_create+0x33/0x460 with crng_init=1
> [    0.049025] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
> [    0.050036] ftrace: allocating 47158 entries in 185 pages
> [    0.097487] ftrace: allocated 185 pages with 5 groups
> [    0.109210] rcu: Hierarchical RCU implementation.
> 
> vs.
> 
> [    0.041115] Memory: 6869396K/7200304K available (14345K kernel code, 3433K 
> rwdata, 5284K rodata, 2624K init, 6088K bss, 330652K reserved, 0K 
> cma-reserved)
> [    0.041127] random: get_random_u64 called from 
> __kmem_cache_create+0x31/0x430 with crng_init=1
> [    0.041309] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
> [    0.041335] ftrace: allocating 47184 entries in 185 pages
> [    0.055719] ftrace: allocated 185 pages with 5 groups
> [    0.055863] rcu: Hierarchical RCU implementation.
> 
> 
> And it gets especially bad during ACPI table processing:
> 
> [    4.158303] ACPI: Added _OSI(Module Device)
> [    4.158767] ACPI: Added _OSI(Processor Device)
> [    4.159230] ACPI: Added _OSI(3.0 _SCP Extensions)
> [    4.159705] ACPI: Added _OSI(Processor Aggregator Device)
> [    4.160551] ACPI: Added _OSI(Linux-Dell-Video)
> [    4.161359] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
> [    4.162264] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
> [   17.713421] ACPI: 13 ACPI AML tables successfully acquired and loaded
> [   18.716065] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
> [   20.743828] ACPI: EC: EC started
> [   20.744155] ACPI: EC: interrupt blocked
> [   20.945956] ACPI: EC: EC_CMD/EC_SC=0x666, EC_DATA=0x662
> [   20.946618] ACPI: \_SB_.PCI0.LPC0.EC0_: Boot DSDT EC used to handle 
> transactions
> [   20.947348] ACPI: Interpreter enabled
> [   20.951278] ACPI: (supports S0 S3 S4 S5)
> [   20.951632] ACPI: Using IOAPIC for interrupt routing
> 
> vs.
> 
> [    0.216039] ACPI: Added _OSI(Module Device)
> [    0.216041] ACPI: Added _OSI(Processor Device)
> [    0.216043] ACPI: Added _OSI(3.0 _SCP Extensions)
> [    0.216044] ACPI: Added _OSI(Processor Aggregator Device)
> [    0.216046] ACPI: Added _OSI(Linux-Dell-Video)
> [    0.216048] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
> [    0.216049] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
> [    0.228259] ACPI: 13 ACPI AML tables successfully acquired and loaded
> [    0.229527] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
> [    0.231663] ACPI: EC: EC started
> [    0.231666] ACPI: EC: interrupt blocked
> [    0.233664] ACPI: EC: EC_CMD/EC_SC=0x666, EC_DATA=0x662
> [    0.233667] ACPI: \_SB_.PCI0.LPC0.EC0_: Boot DSDT EC used to handle 
> transactions
> [    0.233670] ACPI: Interpreter enabled
> [    0.233685] ACPI: (supports S0 S3 S4 S5)
> [    0.233687] ACPI: Using IOAPIC for interrupt routing
> 
> The jump from 4.1 -> 17.7 is especially bad.
> 
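
Jumps like this can be picked out of a full dmesg mechanically. A minimal sketch, assuming the usual "[seconds.micros] message" printk prefix seen in the excerpts above; the helper name `largest_gaps` is hypothetical, and wrapped continuation lines are simply skipped:

```python
import re

# Matches the "[   17.713421] message" prefix produced by printk timestamps.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def largest_gaps(lines, top=3):
    """Return the `top` largest (gap_seconds, previous_message, next_message) tuples."""
    entries = []
    for line in lines:
        # Strip mail quoting ("> ") so pasted excerpts work too.
        match = STAMP.match(line.lstrip("> ").rstrip())
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    gaps = [
        (round(later[0] - earlier[0], 6), earlier[1], later[1])
        for earlier, later in zip(entries, entries[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


sample = [
    "[    4.162264] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)",
    "[   17.713421] ACPI: 13 ACPI AML tables successfully acquired and loaded",
    "[   18.716065] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored",
    "[   20.743828] ACPI: EC: EC started",
]
for gap, before, after in largest_gaps(sample):
    print(f"{gap:10.6f}s  {before!r} -> {after!r}")
```

Run against both logs, this makes it easy to compare where the slow and fast kernels spend their time between messages.
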
> Which might in fact indicate that this could be related to using some very 
> special slow (ACPI?) memory for ordinary purposes, interfering with actual 
> ACPI users?
> 
> But again, that's just a wild guess: the system is extremely slow 
> afterwards as well, but we don't see any other pauses without signs of life 
> that last that long.
> 
> 
> It would be interesting to run a simple memory bandwidth benchmark on the 
> fast kernel with increasing sizes, up to the point of running OOM, to see 
> whether there really is some memory that is just horribly slow once 
> allocated and used.
> 
> --
> Thanks,
> 
> David / dhildenb
> 
