Re: [PATCH v3 0/6] mm: Further memory block device cleanups
On Fri, 21 Jun 2019 20:24:59 +0200 David Hildenbrand wrote:

> @Qian Cai, unfortunately I can't reproduce.
>
> If you get the chance, it would be great if you could retry with
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 972c5336bebf..742f99ddd148 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -868,6 +868,9 @@ int walk_memory_blocks(unsigned long start, unsigned long size,
>  	unsigned long block_id;
>  	int ret = 0;
>
> +	if (!size)
> +		return;
> +
>  	for (block_id = start_block_id; block_id <= end_block_id; block_id++) {
>  		mem = find_memory_block_by_id(block_id);
>  		if (!mem)
>
> If both start and size are 0, we would get a very long loop. This
> would mean that we have an online node that does not span any pages at
> all (pgdat->node_start_pfn = 0, start_pfn + pgdat->node_spanned_pages = 0).

I think I'll make that a `return 0' and I won't drop patches 4-6 for now,
as we appear to have this fixed.

From: David Hildenbrand
Subject: drivers-base-memoryc-get-rid-of-find_memory_block_hinted-v3-fix

handle zero-length walks

Link: http://lkml.kernel.org/r/1c2edc22-afd7-2211-c4c7-40e54e500...@redhat.com
Reported-by: Qian Cai
Tested-by: Qian Cai
Signed-off-by: Andrew Morton
---

 drivers/base/memory.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/base/memory.c~drivers-base-memoryc-get-rid-of-find_memory_block_hinted-v3-fix
+++ a/drivers/base/memory.c
@@ -866,6 +866,9 @@ int walk_memory_blocks(unsigned long sta
 	unsigned long block_id;
 	int ret = 0;

+	if (!size)
+		return 0;
+
 	for (block_id = start_block_id; block_id <= end_block_id; block_id++) {
 		mem = find_memory_block_by_id(block_id);
 		if (!mem)
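For readers following the "very long loop" remark, here is a minimal user-space sketch of how a zero-length walk could end up iterating over nearly every possible block id. It assumes the end block id is derived from start + size - 1, which the hunk quoted in this thread does not show. BLOCK_SHIFT and all helper names below are hypothetical and chosen only for illustration; this is not the kernel code.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical block size for illustration only: 2^27 bytes (128 MiB). */
#define BLOCK_SHIFT 27

/* Hypothetical helper: id of the memory block containing addr. */
unsigned long addr_to_block_id(unsigned long addr)
{
	return addr >> BLOCK_SHIFT;
}

/*
 * Assumed end-of-range computation: the last byte is start + size - 1.
 * With start == 0 and size == 0 the subtraction wraps to ULONG_MAX,
 * so the "last" block id becomes the largest possible one.
 */
unsigned long last_block_id_unguarded(unsigned long start, unsigned long size)
{
	return addr_to_block_id(start + size - 1);
}

/* Number of block ids an inclusive walk visits once size == 0 is rejected. */
unsigned long blocks_walked_guarded(unsigned long start, unsigned long size)
{
	if (!size)
		return 0;
	return last_block_id_unguarded(start, size) -
	       addr_to_block_id(start) + 1;
}

/* Print both cases side by side; call this from main() to try it out. */
void show_zero_length_walk(void)
{
	printf("unguarded last id: %lu\n", last_block_id_unguarded(0, 0));
	printf("guarded count:     %lu\n", blocks_walked_guarded(0, 0));
	assert(blocks_walked_guarded(0, 0) == 0);
}
```

Under that assumption, an inclusive loop from block id 0 up to the unguarded last id would perform a device lookup for every id in between, which would be consistent with the soft lockup showing up under subsys_find_device_by_id() in the reported trace. The early return avoids the wraparound altogether.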
Re: [PATCH v3 0/6] mm: Further memory block device cleanups
On Thu, 2019-06-20 at 20:31 +0200, David Hildenbrand wrote:
> @Andrew: Only patch 1, 4 and 6 changed compared to v1.
>
> Some further cleanups around memory block devices. Especially, clean up
> and simplify walk_memory_range(). Including some other minor cleanups.
>
> Compiled + tested on x86 with DIMMs under QEMU. Compile-tested on ppc64.
>
> v2 -> v3:
> - "mm/memory_hotplug: Rename walk_memory_range() and pass start+size .."
> -- Avoid warning on ppc.
> - "drivers/base/memory.c: Get rid of find_memory_block_hinted()"
> -- Fixup a comment regarding hinted devices.
>
> v1 -> v2:
> - "mm: Section numbers use the type "unsigned long""
> -- "unsigned long i" -> "unsigned long nr", in one case -> "int i"
> - "drivers/base/memory.c: Get rid of find_memory_block_hinted()"
> -- Fix compilation error
> -- Get rid of the "hint" parameter completely
>
> David Hildenbrand (6):
>   mm: Section numbers use the type "unsigned long"
>   drivers/base/memory: Use "unsigned long" for block ids
>   mm: Make register_mem_sect_under_node() static
>   mm/memory_hotplug: Rename walk_memory_range() and pass start+size
>     instead of pfns
>   mm/memory_hotplug: Move and simplify walk_memory_blocks()
>   drivers/base/memory.c: Get rid of find_memory_block_hinted()
>
>  arch/powerpc/platforms/powernv/memtrace.c |  23 ++---
>  drivers/acpi/acpi_memhotplug.c            |  19 +---
>  drivers/base/memory.c                     | 120 +-
>  drivers/base/node.c                       |   8 +-
>  include/linux/memory.h                    |   5 +-
>  include/linux/memory_hotplug.h            |   2 -
>  include/linux/mmzone.h                    |   4 +-
>  include/linux/node.h                      |   7 --
>  mm/memory_hotplug.c                       |  57 +-
>  mm/sparse.c                               |  12 +--
>  10 files changed, 106 insertions(+), 151 deletions(-)
>

This series causes a few machines to fail to boot, triggering endless soft
lockups. Reverting those commits fixed the issue.

97f4217d1da0 Revert "mm/memory_hotplug: rename walk_memory_range() and pass start+size instead of pfns"
c608eebf33c6 Revert "mm-memory_hotplug-rename-walk_memory_range-and-pass-startsize-instead-of-pfns-fix"
34b5e4ab7558 Revert "mm/memory_hotplug: move and simplify walk_memory_blocks()"
59a9f3eec5d1 Revert "drivers/base/memory.c: Get rid of find_memory_block_hinted()"
5cfcd52288b6 Revert "drivers-base-memoryc-get-rid-of-find_memory_block_hinted-v3"

[4.582081][T1] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
[4.590405][T1] ACPI: bus type PCI registered
[4.592908][T1] PCI: MMCONFIG for domain [bus 00-ff] at [mem 0x8000-0x8fff] (base 0x8000)
[4.601860][T1] PCI: MMCONFIG at [mem 0x8000-0x8fff] reserved in E820
[4.601860][T1] PCI: Using configuration type 1 for base access
[ 28.661336][ C16] watchdog: BUG: soft lockup - CPU#16 stuck for 22s! [swapper/0:1]
[ 28.671351][ C16] Modules linked in:
[ 28.671354][ C16] CPU: 16 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc5-next-20190621+ #1
[ 28.681366][ C16] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 03/09/2018
[ 28.691334][ C16] RIP: 0010:_raw_spin_unlock_irqrestore+0x2f/0x40
[ 28.701334][ C16] Code: 55 48 89 e5 41 54 49 89 f4 be 01 00 00 00 53 48 8b 55 08 48 89 fb 48 8d 7f 18 e8 4c 89 7d ff 48 89 df e8 94 f9 7d ff 41 54 9d <65> ff 0d c2 44 8d 48 5b 41 5c 5d c3 0f 1f 44 00 00 0f 1f 44 00 00
[ 28.711354][ C16] RSP: 0018:888205b27bf8 EFLAGS: 0246 ORIG_RAX: ff13
[ 28.721372][ C16] RAX: RBX: 8882053d6138 RCX: b6f2a3b8
[ 28.731371][ C16] RDX: 111040a7ac27 RSI: dc00 RDI: 8882053d6138
[ 28.741371][ C16] RBP: 888205b27c08 R08: ed1040a7ac28 R09: ed1040a7ac27
[ 28.751334][ C16] R10: ed1040a7ac27 R11: 8882053d613b R12: 0246
[ 28.751370][ C16] R13: 888205b27c98 R14: 8884504d0a20 R15:
[ 28.761368][ C16] FS: () GS:88845450() knlGS:
[ 28.771373][ C16] CS: 0010 DS: ES: CR0: 80050033
[ 28.781334][ C16] CR2: CR3: 0007c9012000 CR4: 001406a0
[ 28.791333][ C16] Call Trace:
[ 28.791374][ C16] klist_next+0xd8/0x1c0
[ 28.791374][ C16] subsys_find_device_by_id+0x13b/0x1f0
[ 28.801334][ C16] ? bus_find_device_by_name+0x20/0x20
[ 28.801370][ C16] ? kobject_put+0x23/0x250
[ 28.811333][ C16] walk_memory_blocks+0x6c/0xb8
[ 28.811353][ C16] ? write_policy_show+0x40/0x40
[ 28.821334][ C16] link_mem_sections+0x7e/0xa0
[ 28.821369][ C16] ? unregister_memory_block_under_nodes+0x210/0x210
[ 28.831353][ C16] ? __register_one_node+0x3bd/0x600
[ 28.831353][ C16] topology_init+0xbf/0x126
[ 28.841364][ C16] ? enable_cpu0_hotplug+0x1a/0x1a
[ 28.841368][ C16]