Re: [PATCH] arm64: Call numa_store_cpu_info() earlier.
On Tue, Sep 20, 2016 at 11:46:35AM -0700, David Daney wrote:
> From: David Daney
>
> The wq_numa_init() function makes a private CPU to node map by calling
> cpu_to_node() early in the boot process, before the non-boot CPUs are
> brought online.  Since the default implementation of cpu_to_node()
> returns zero for CPUs that have never been brought online, the
> workqueue system's view is that *all* CPUs are on node zero.
>
> When the unbound workqueue for a non-zero node is created, the
> tsk_cpus_allowed() for the worker threads is the empty set because
> there are, in the view of the workqueue system, no CPUs on non-zero
> nodes.  The code in try_to_wake_up() using this empty cpumask ends up
> using the cpumask empty set value of NR_CPUS as an index into the
> per-CPU area pointer array, and gets garbage as it is one past the end
> of the array.  This results in:

Queued for 4.8. Thanks.

-- 
Catalin
Re: [PATCH] arm64: Call numa_store_cpu_info() earlier.
On 2016/9/21 2:46, David Daney wrote:
> From: David Daney
>
> Fix by moving call to numa_store_cpu_info() for all CPUs into
> smp_prepare_cpus(), which happens before wq_numa_init().  Since
> smp_store_cpu_info() now contains only a single function call,
> simplify by removing the function and out-lining its contents.
>
> Suggested-by: Robert Richter
> fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.")
> Cc: # 4.7.x-
> Signed-off-by: David Daney
> ---

Tested-by: Yisheng Xie

Thanks.

>  arch/arm64/kernel/smp.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index d93d433..3ff173e 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -201,12 +201,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  	return ret;
>  }
>  
> -static void smp_store_cpu_info(unsigned int cpuid)
> -{
> -	store_cpu_topology(cpuid);
> -	numa_store_cpu_info(cpuid);
> -}
> -
>  /*
>   * This is the secondary CPU boot entry.  We're using this CPUs
>   * idle thread stack, but a set of temporary page tables.
> @@ -254,7 +248,7 @@ asmlinkage void secondary_start_kernel(void)
>  	 */
>  	notify_cpu_starting(cpu);
>  
> -	smp_store_cpu_info(cpu);
> +	store_cpu_topology(cpu);
>  
>  	/*
>  	 * OK, now it's safe to let the boot CPU continue.  Wait for
> @@ -689,10 +683,13 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  {
>  	int err;
>  	unsigned int cpu;
> +	unsigned int this_cpu;
>  
>  	init_cpu_topology();
>  
> -	smp_store_cpu_info(smp_processor_id());
> +	this_cpu = smp_processor_id();
> +	store_cpu_topology(this_cpu);
> +	numa_store_cpu_info(this_cpu);
>  
>  	/*
>  	 * If UP is mandated by "nosmp" (which implies "maxcpus=0"), don't set
> @@ -719,6 +716,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  			continue;
>  
>  		set_cpu_present(cpu, true);
> +		numa_store_cpu_info(cpu);
>  	}
>  }
>  
>
Re: [PATCH] arm64: Call numa_store_cpu_info() earlier.
On 20.09.16 11:46:35, David Daney wrote:
> From: David Daney

...

> Fix by moving call to numa_store_cpu_info() for all CPUs into
> smp_prepare_cpus(), which happens before wq_numa_init().  Since
> smp_store_cpu_info() now contains only a single function call,
> simplify by removing the function and out-lining its contents.
>
> Suggested-by: Robert Richter
> fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.")
> Cc: # 4.7.x-
> Signed-off-by: David Daney
> ---
>  arch/arm64/kernel/smp.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)

Looks good, your version properly initializes the boot cpu that was
missing in my version.

Reviewed-by: Robert Richter
[PATCH] arm64: Call numa_store_cpu_info() earlier.
From: David Daney

The wq_numa_init() function makes a private CPU to node map by calling
cpu_to_node() early in the boot process, before the non-boot CPUs are
brought online.  Since the default implementation of cpu_to_node()
returns zero for CPUs that have never been brought online, the
workqueue system's view is that *all* CPUs are on node zero.

When the unbound workqueue for a non-zero node is created, the
tsk_cpus_allowed() for the worker threads is the empty set because
there are, in the view of the workqueue system, no CPUs on non-zero
nodes.  The code in try_to_wake_up() using this empty cpumask ends up
using the cpumask empty set value of NR_CPUS as an index into the
per-CPU area pointer array, and gets garbage as it is one past the end
of the array.  This results in:

[0.881970] Unable to handle kernel paging request at virtual address fb1008b926a4
[1.970095] pgd = fc00094b
[1.973530] [fb1008b926a4] *pgd=, *pud=, *pmd=
[1.982610] Internal error: Oops: 9604 [#1] SMP
[1.987541] Modules linked in:
[1.990631] CPU: 48 PID: 295 Comm: cpuhp/48 Tainted: GW 4.8.0-rc6-preempt-vol+ #9
[1.999435] Hardware name: Cavium ThunderX CN88XX board (DT)
[2.005159] task: fe0fe89cc300 task.stack: fe0fe8b8c000
[2.011158] PC is at try_to_wake_up+0x194/0x34c
[2.015737] LR is at try_to_wake_up+0x150/0x34c
[2.020318] pc : [] lr : [] pstate: 60c5
[2.027803] sp : fe0fe8b8fb10
[2.031149] x29: fe0fe8b8fb10 x28:
[2.036522] x27: fc0008c63bc8 x26: 1000
[2.041896] x25: fc0008c63c80 x24: fc0008bfb200
[2.047270] x23: 00c0 x22: 0004
[2.052642] x21: fe0fe89d25bc x20: 1000
[2.058014] x19: fe0fe89d1d00 x18:
[2.063386] x17: x16:
[2.068760] x15: 0018 x14:
[2.074133] x13: x12:
[2.079505] x11: x10:
[2.084879] x9 : x8 :
[2.090251] x7 : 0040 x6 :
[2.095621] x5 : x4 :
[2.100991] x3 : x2 :
[2.106364] x1 : fc0008be4c24 x0 : ff0ada80
[2.111737]
[2.113236] Process cpuhp/48 (pid: 295, stack limit = 0xfe0fe8b8c020)
[2.120102] Stack: (0xfe0fe8b8fb10 to 0xfe0fe8b9)
[2.125914] fb00: fe0fe8b8fb80 fc00080e7648
. . .
[2.442859] Call trace:
[2.445327] Exception stack(0xfe0fe8b8f940 to 0xfe0fe8b8fa70)
[2.451843] f940: fe0fe89d1d00 0400 fe0fe8b8fb10 fc00080e7468
[2.459767] f960: fe0fe8b8f980 fc00080e4958 ff0ff91ab200 fc00080e4b64
[2.467690] f980: fe0fe8b8f9d0 fc00080e515c fe0fe8b8fa80
[2.475614] f9a0: fe0fe8b8f9d0 fc00080e58e4 fe0fe8b8fa80
[2.483540] f9c0: fe0fe8d1 0040 fe0fe8b8fa50 fc00080e5ac4
[2.491465] f9e0: ff0ada80 fc0008be4c24
[2.499387] fa00: 0040
[2.507309] fa20:
[2.515233] fa40: 0018
[2.523156] fa60:
[2.528089] [] try_to_wake_up+0x194/0x34c
[2.533723] [] wake_up_process+0x28/0x34
[2.539275] [] create_worker+0x110/0x19c
[2.544824] [] alloc_unbound_pwq+0x3cc/0x4b0
[2.550724] [] wq_update_unbound_numa+0x10c/0x1e4
[2.557066] [] workqueue_online_cpu+0x220/0x28c
[2.563234] [] cpuhp_invoke_callback+0x6c/0x168
[2.569398] [] cpuhp_up_callbacks+0x44/0xe4
[2.575210] [] cpuhp_thread_fun+0x13c/0x148
[2.581027] [] smpboot_thread_fn+0x19c/0x1a8
[2.586929] [] kthread+0xdc/0xf0
[2.591776] [] ret_from_fork+0x10/0x50
[2.597147] Code: b00057e1 91304021 91005021 b8626822 (b8606821)
[2.603464] ---[ end trace 58c0cd36b88802bc ]---
[2.608138] Kernel panic - not syncing: Fatal exception

Fix by moving call to numa_store_cpu_info() for all CPUs into
smp_prepare_cpus(), which happens before wq_numa_init().  Since
smp_store_cpu_info() now contains only a single function call,
simplify by removing the function and out-lining its contents.

Suggested-by: Robert Richter
fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.")
Cc: # 4.7.x-
Signed-off-by: David Daney
---
 arch/arm64/kernel/smp.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/smp.c