[Kgdb-bugreport] [PATCH v12 5/7] arm64: smp: IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI
There's no reason why IPI_CPU_STOP and IPI_CPU_CRASH_STOP can't be
handled as NMI. They are very simple and everything in them is
NMI-safe. Mark them as things to use NMI for if NMI is available.

Suggested-by: Mark Rutland
Reviewed-by: Stephen Boyd
Reviewed-by: Misono Tomohiro
Reviewed-by: Sumit Garg
Signed-off-by: Douglas Anderson
---
I don't actually have any good way to test/validate this patch. It's
added to the series at Mark's request.

(no changes since v10)

Changes in v10:
- ("IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI") new for v10.

 arch/arm64/kernel/smp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 28c904ca499a..800c59cf9b64 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -946,6 +946,8 @@ static bool ipi_should_be_nmi(enum ipi_msg_type ipi)
 		return false;
 
 	switch (ipi) {
+	case IPI_CPU_STOP:
+	case IPI_CPU_CRASH_STOP:
 	case IPI_CPU_BACKTRACE:
 		return true;
 	default:
-- 
2.42.0.283.g2d96d420d3-goog

___
Kgdb-bugreport mailing list
Kgdb-bugreport@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport
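For readers who want to poke at the selection logic outside the kernel, here is a minimal user-space sketch of ipi_should_be_nmi() as it stands after this patch. It is only a model: `prio_masking_enabled` is a hypothetical stand-in for system_uses_irq_prio_masking(), the order of the first enum entries is assumed, and the assert() is a model-only sanity check that the kernel function does not have.

```c
#include <assert.h>
#include <stdbool.h>

/* Trimmed copy of the enum; the order of the first entries is assumed. */
enum ipi_msg_type {
	IPI_RESCHEDULE,
	IPI_CALL_FUNC,
	IPI_CPU_STOP,
	IPI_CPU_CRASH_STOP,
	IPI_TIMER,
	IPI_IRQ_WORK,
	NR_IPI,
	IPI_CPU_BACKTRACE = NR_IPI,
	MAX_IPI
};

/* Hypothetical stand-in for system_uses_irq_prio_masking(). */
bool prio_masking_enabled;

bool ipi_should_be_nmi(enum ipi_msg_type ipi)
{
	/* Model-only sanity check; not present in the kernel code. */
	assert(ipi < MAX_IPI);

	/* Without priority masking there is no pseudo-NMI to ask for. */
	if (!prio_masking_enabled)
		return false;

	switch (ipi) {
	case IPI_CPU_STOP:
	case IPI_CPU_CRASH_STOP:
	case IPI_CPU_BACKTRACE:
		return true;
	default:
		return false;
	}
}
```

With the flag clear every IPI stays a regular IPI, which is the fallback behaviour the series describes.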
[Kgdb-bugreport] [PATCH v12 7/7] arm64: smp: Mark IPI globals as __ro_after_init
Mark the three IPI-related globals in smp.c as "__ro_after_init" since
they are only ever set in set_smp_ipi_range(), which is marked
"__init". This is a better and more secure marking than the old
"__read_mostly".

Suggested-by: Stephen Boyd
Signed-off-by: Douglas Anderson
---
This patch is almost completely unrelated to the rest of the series
other than the fact that it would cause a merge conflict with the
series if sent separately. I tacked it on to this series in response
to Stephen's feedback on v11 of this series [1]. If someone hates it
(not sure why they would), it could be dropped. If someone loves it,
it could be promoted to the start of the series and/or land on its own
(resolving merge conflicts).

[1] https://lore.kernel.org/r/cae-0n52ivdgza8xt8ktmj12c_essjt7f7a0fuz_oammqpgc...@mail.gmail.com

Changes in v12:
- ("arm64: smp: Mark IPI globals as __ro_after_init") new for v12.

 arch/arm64/kernel/smp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 1a53e57c81d0..814d9aa93b21 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -84,9 +84,9 @@ enum ipi_msg_type {
 	MAX_IPI
 };
 
-static int ipi_irq_base __read_mostly;
-static int nr_ipi __read_mostly = NR_IPI;
-static struct irq_desc *ipi_desc[MAX_IPI] __read_mostly;
+static int ipi_irq_base __ro_after_init;
+static int nr_ipi __ro_after_init = NR_IPI;
+static struct irq_desc *ipi_desc[MAX_IPI] __ro_after_init;
 
 static void ipi_setup(int cpu);
 
-- 
2.42.0.283.g2d96d420d3-goog
[Kgdb-bugreport] [PATCH v12 4/7] arm64: smp: Add arch support for backtrace using pseudo-NMI
Enable arch_trigger_cpumask_backtrace() support on arm64. This enables things much like they are enabled on arm32 (including some of the funky logic around NR_IPI, nr_ipi, and MAX_IPI) but with the difference that, unlike arm32, we'll try to enable the backtrace to use pseudo-NMI. NOTE: this patch is a squash of the little bit of code adding the ability to mark an IPI to try to use pseudo-NMI plus the little bit of code to hook things up for kgdb. This approach was decided upon in the discussion of v9 [1]. This patch depends on commit 8d539b84f1e3 ("nmi_backtrace: allow excluding an arbitrary CPU") since that commit changed the prototype of arch_trigger_cpumask_backtrace(), which this patch implements. [1] https://lore.kernel.org/r/ZORY51mF4alI41G1@FVFF77S0Q05N Co-developed-by: Sumit Garg Signed-off-by: Sumit Garg Co-developed-by: Mark Rutland Signed-off-by: Mark Rutland Reviewed-by: Stephen Boyd Reviewed-by: Misono Tomohiro Signed-off-by: Douglas Anderson --- Changes in v12: - Minor comment change to add "()" after nmi_trigger_cpumask_backtrace. - Updated the commit hash of the commit this depends on. Changes in v11: - Adjust comment about NR_IPI/MAX_IPI. - Don't use confusing "backed by" idiom in comment. - Made arm64_backtrace_ipi() static. Changes in v10: - Backtrace now directly supported in smp.c - Squash backtrace into patch adding support for pseudo-NMI IPIs. Changes in v9: - Added comments that we might not be using NMI always. - Fold in v8 patch #10 ("Fallback to a regular IPI if NMI isn't enabled") - Moved header file out of "include" since it didn't need to be there. - Remove arm64_supports_nmi() - Renamed "NMI IPI" to "debug IPI" since it might not be backed by NMI. 
- arch_trigger_cpumask_backtrace() no longer returns bool Changes in v8: - Removed "#ifdef CONFIG_SMP" since arm64 is always SMP - debug_ipi_setup() and debug_ipi_teardown() no longer take cpu param arch/arm64/include/asm/irq.h | 3 ++ arch/arm64/kernel/smp.c | 86 +++- 2 files changed, 78 insertions(+), 11 deletions(-) diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h index fac08e18bcd5..50ce8b697ff3 100644 --- a/arch/arm64/include/asm/irq.h +++ b/arch/arm64/include/asm/irq.h @@ -6,6 +6,9 @@ #include +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu); +#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace + struct pt_regs; int set_handle_irq(void (*handle_irq)(struct pt_regs *)); diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index a5848f1ef817..28c904ca499a 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include @@ -72,12 +73,18 @@ enum ipi_msg_type { IPI_CPU_CRASH_STOP, IPI_TIMER, IPI_IRQ_WORK, - NR_IPI + NR_IPI, + /* +* Any enum >= NR_IPI and < MAX_IPI is special and not tracable +* with trace_ipi_* +*/ + IPI_CPU_BACKTRACE = NR_IPI, + MAX_IPI }; static int ipi_irq_base __read_mostly; static int nr_ipi __read_mostly = NR_IPI; -static struct irq_desc *ipi_desc[NR_IPI] __read_mostly; +static struct irq_desc *ipi_desc[MAX_IPI] __read_mostly; static void ipi_setup(int cpu); @@ -845,6 +852,22 @@ static void __noreturn ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs #endif } +static void arm64_backtrace_ipi(cpumask_t *mask) +{ + __ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask); +} + +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu) +{ + /* +* NOTE: though nmi_trigger_cpumask_backtrace() has "nmi_" in the name, +* nothing about it truly needs to be implemented using an NMI, it's +* just that it's _allowed_ to work with NMIs. 
If ipi_should_be_nmi() +* returned false our backtrace attempt will just use a regular IPI. +*/ + nmi_trigger_cpumask_backtrace(mask, exclude_cpu, arm64_backtrace_ipi); +} + /* * Main handler for inter-processor interrupts */ @@ -888,6 +911,14 @@ static void do_handle_IPI(int ipinr) break; #endif + case IPI_CPU_BACKTRACE: + /* +* NOTE: in some cases this _won't_ be NMI context. See the +* comment in arch_trigger_cpumask_backtrace(). +*/ + nmi_cpu_backtrace(get_irq_regs()); + break; + default: pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr); break; @@ -909,6 +940,19 @@ static void smp_cross_call(const struct cpumask *target, unsigned int ipinr) __ipi_send_mask(ipi_desc[ipinr], target); } +static bool ipi_should_be_nmi(enum ipi_msg_type ipi) +{ + if (!system_uses_irq_prio_masking()) + return false; + + switch (ipi) { + case IPI_CPU_BACKTRACE: + return true; +
[Kgdb-bugreport] [PATCH v12 6/7] arm64: kgdb: Implement kgdb_roundup_cpus() to enable pseudo-NMI roundup
Up until now we've been using the generic (weak) implementation for
kgdb_roundup_cpus() when using kgdb on arm64. Let's move to a custom
one. The advantage here is that, when pseudo-NMI is enabled on a
device, we'll be able to round up CPUs using pseudo-NMI. This allows
us to debug CPUs that are stuck with interrupts disabled. If
pseudo-NMIs are not enabled then we'll fall back to just using an IPI,
which is still slightly better than the generic implementation since
it avoids the potential situation described in the generic
kgdb_call_nmi_hook().

Co-developed-by: Sumit Garg
Signed-off-by: Sumit Garg
Reviewed-by: Daniel Thompson
Reviewed-by: Stephen Boyd
Signed-off-by: Douglas Anderson
---
I debated whether this should be in "arch/arm64/kernel/smp.c" or if I
should try to find a way for it to go into "arch/arm64/kernel/kgdb.c".
In the end this is so little code that it didn't seem worth it to find
a way to export the IPI defines or to otherwise come up with some API
between kgdb.c and smp.c. If someone has strong feelings and wants
this to change, please shout and give details of your preferred
solution.

FWIW, it seems like ~half the other platforms put this in "smp.c" with
an ifdef for KGDB and the other half put it in "kgdb.c" with an ifdef
for SMP. :-P

(no changes since v10)

Changes in v10:
- Don't allocate the cpumask on the stack; just iterate.
- Moved kgdb calls to smp.c to avoid needing to export IPI info.
- kgdb now has its own IPI.

Changes in v9:
- Remove fallback for when debug IPI isn't available.
- Renamed "NMI IPI" to "debug IPI" since it might not be backed by NMI.

 arch/arm64/kernel/smp.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 800c59cf9b64..1a53e57c81d0 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -32,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -79,6 +80,7 @@ enum ipi_msg_type {
 	 * with trace_ipi_*
 	 */
 	IPI_CPU_BACKTRACE = NR_IPI,
+	IPI_KGDB_ROUNDUP,
 	MAX_IPI
 };
 
@@ -868,6 +870,22 @@ void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
 	nmi_trigger_cpumask_backtrace(mask, exclude_cpu, arm64_backtrace_ipi);
 }
 
+#ifdef CONFIG_KGDB
+void kgdb_roundup_cpus(void)
+{
+	int this_cpu = raw_smp_processor_id();
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		/* No need to roundup ourselves */
+		if (cpu == this_cpu)
+			continue;
+
+		__ipi_send_single(ipi_desc[IPI_KGDB_ROUNDUP], cpu);
+	}
+}
+#endif
+
 /*
  * Main handler for inter-processor interrupts
  */
@@ -919,6 +937,10 @@ static void do_handle_IPI(int ipinr)
 		nmi_cpu_backtrace(get_irq_regs());
 		break;
 
+	case IPI_KGDB_ROUNDUP:
+		kgdb_nmicallback(cpu, get_irq_regs());
+		break;
+
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr);
 		break;
@@ -949,6 +971,7 @@ static bool ipi_should_be_nmi(enum ipi_msg_type ipi)
 	case IPI_CPU_STOP:
 	case IPI_CPU_CRASH_STOP:
 	case IPI_CPU_BACKTRACE:
+	case IPI_KGDB_ROUNDUP:
 		return true;
 	default:
 		return false;
-- 
2.42.0.283.g2d96d420d3-goog
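The enum layout and the roundup loop above can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the kernel code: `MODEL_NR_CPUS`, the `online` array, `roundup_sent` and `model_send_roundup()` are hypothetical stand-ins for the online CPU mask and for __ipi_send_single(ipi_desc[IPI_KGDB_ROUNDUP], cpu), and the order of the first enum entries is assumed.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8	/* hypothetical CPU count for the model */

/* Trimmed copy of the enum; the order of the first entries is assumed. */
enum ipi_msg_type {
	IPI_RESCHEDULE,
	IPI_CALL_FUNC,
	IPI_CPU_STOP,
	IPI_CPU_CRASH_STOP,
	IPI_TIMER,
	IPI_IRQ_WORK,
	NR_IPI,
	/* Entries >= NR_IPI and < MAX_IPI are not traced with trace_ipi_* */
	IPI_CPU_BACKTRACE = NR_IPI,
	IPI_KGDB_ROUNDUP,
	MAX_IPI
};

/* True for the IPIs that fall in the traced range below NR_IPI. */
bool ipi_is_traced(enum ipi_msg_type ipi)
{
	return ipi < NR_IPI;
}

/* Records which CPUs the model "sent" a roundup IPI to. */
bool roundup_sent[MODEL_NR_CPUS];

/* Hypothetical stand-in for __ipi_send_single(ipi_desc[IPI_KGDB_ROUNDUP], cpu). */
void model_send_roundup(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	roundup_sent[cpu] = true;
}

/*
 * Mirrors the loop in kgdb_roundup_cpus(): visit every online CPU except
 * the caller, without building a cpumask. Returns the number of IPIs sent.
 */
int model_kgdb_roundup_cpus(int this_cpu, const bool *online)
{
	int sent = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!online[cpu])
			continue;
		/* No need to round up ourselves */
		if (cpu == this_cpu)
			continue;
		model_send_roundup(cpu);
		sent++;
	}
	return sent;
}
```

Iterating instead of allocating a mask matches the v10 changelog entry "Don't allocate the cpumask on the stack; just iterate."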
[Kgdb-bugreport] [PATCH v12 2/7] arm64: idle: Tag the arm64 idle functions as __cpuidle
As per the (somewhat recent) comment before the definition of
`__cpuidle`, the tag is like `noinstr` but also marks a function so it
can be identified by cpu_in_idle(). Let's add these markings to arm64
cpuidle functions.

With this change we get useful backtraces like:

  NMI backtrace for cpu N skipped: idling at cpu_do_idle+0x94/0x98

instead of useless backtraces when dumping all processors using
nmi_cpu_backtrace().

NOTE: this patch won't make cpu_in_idle() work perfectly for arm64,
but it doesn't hurt and does catch some cases. Specifically an example
that wasn't caught in my testing looked like this:

  gic_cpu_sys_reg_init+0x1f8/0x314
  gic_cpu_pm_notifier+0x40/0x78
  raw_notifier_call_chain+0x5c/0x134
  cpu_pm_notify+0x38/0x64
  cpu_pm_exit+0x20/0x2c
  psci_enter_idle_state+0x48/0x70
  cpuidle_enter_state+0xb8/0x260
  cpuidle_enter+0x44/0x5c
  do_idle+0x188/0x30c

Acked-by: Mark Rutland
Reviewed-by: Stephen Boyd
Acked-by: Sumit Garg
Signed-off-by: Douglas Anderson
---

(no changes since v11)

Changes in v11:
- Updated commit message as per Stephen.

Changes in v9:
- Added to commit message that this doesn't catch all cases.

Changes in v8:
- "Tag the arm64 idle functions as __cpuidle" new for v8

 arch/arm64/kernel/idle.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
index c1125753fe9b..05cfb347ec26 100644
--- a/arch/arm64/kernel/idle.c
+++ b/arch/arm64/kernel/idle.c
@@ -20,7 +20,7 @@
  *	ensure that interrupts are not masked at the PMR (because the core will
  *	not wake up if we block the wake up signal in the interrupt controller).
  */
-void noinstr cpu_do_idle(void)
+void __cpuidle cpu_do_idle(void)
 {
 	struct arm_cpuidle_irq_context context;
 
@@ -35,7 +35,7 @@ void noinstr cpu_do_idle(void)
 /*
  * This is our default idle handler.
  */
-void noinstr arch_cpu_idle(void)
+void __cpuidle arch_cpu_idle(void)
 {
 	/*
 	 * This should do all the clock switching and wait for interrupt
-- 
2.42.0.283.g2d96d420d3-goog
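As I understand the mechanism, `__cpuidle` places a function in a dedicated text section, and cpu_in_idle() then only has to compare a program counter against that section's bounds. A minimal user-space sketch of that range check follows; `struct text_range` and `model_cpu_in_idle()` are hypothetical names for illustration, and the addresses in any use of it are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the idle text section: [start, end). */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Model of the cpu_in_idle() idea: a PC counts as "idling" only if it
 * lies inside the section that holds the __cpuidle functions.
 */
bool model_cpu_in_idle(const struct text_range *idle_text, uintptr_t pc)
{
	/* Model-only sanity check on the bounds. */
	assert(idle_text->start <= idle_text->end);

	return pc >= idle_text->start && pc < idle_text->end;
}
```

This also suggests why the stack in the NOTE above is not caught: the interrupted PC there is in gic_cpu_sys_reg_init(), which is not tagged `__cpuidle`, so a check of this shape returns false even though the CPU is on its way out of an idle state.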
[Kgdb-bugreport] [PATCH v12 0/7] arm64: Add IPI for backtraces / kgdb; try to use NMI for some IPIs
This is an attempt to resurrect Sumit's old patch series [1] that
allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
also to round up CPUs in kdb/kgdb. The last post from Sumit that I
could find was v7, so I started my series at v8. I haven't copied all
of his old changelogs here, but you can find them from the link.

This patch series targets v6.6. Specifically it can't land in v6.5
since it depends on commit 8d539b84f1e3 ("nmi_backtrace: allow
excluding an arbitrary CPU").

It should be noted that Mark still feels there might be some corner
cases where pseudo-NMI is not production ready [2] [3], but as far as
I'm aware there are no concrete/documented issues. Regardless of
whether this should be enabled for production, though, this series
will be invaluable to anyone trying to debug crashes on arm64
machines.

v12 of this series collects tags, fixes a few small nits in comments
and commit messages from v11 and adds a new (and somewhat unrelated)
small patch to the end of the series. There are no code changes other
than the last patch, which is tiny.

v11 of this series addressed Stephen Boyd's feedback on v10 and added
a missing "static" that the patches robot found.

v10 of this series attempted to address all of Mark's feedback on v9.
As a quick summary:
- It includes his patch to remove IPI_WAKEUP, freeing up an extra IPI.
- It no longer combines the "kgdb" and "backtrace" IPIs. If we need
  another IPI these could always be recombined later.
- It promotes IPI_CPU_STOP and IPI_CPU_CRASH_STOP to NMI.
- It puts nearly all the code directly in smp.c.
- Several of the patches are squashed together.
- Patch #6 ("kgdb: Provide a stub kgdb_nmicallback() if
  !CONFIG_KGDB") was dropped from the series since it landed.

Between v8 and v9, I had cleaned up this patch series by integrating
the 10th patch from v8 [4] into the whole series. As part of this, I
renamed the "NMI IPI" to the "debug IPI" since it could now be backed
by a regular IPI in the case that pseudo NMIs weren't available. With
the fallback, this allowed me to drop some extra patches from the
series. This feels (to me) to be pretty clean and hopefully others
agree. Any patch I touched significantly I removed Masayoshi and
Chen-Yu's tags from.

...also in v8, I reordered the patches a bit in a way that seemed a
little cleaner to me.

Since v7, I have:
* Addressed the small amount of feedback that was there for v7.
* Rebased.
* Added a new patch that prevents us from spamming the logs with idle
  tasks.
* Added an extra patch to gracefully fall back to regular IPIs if
  pseudo-NMIs aren't there.

It can be noted that this patch series works very well with the recent
"hardlockup" patches that have landed through Andrew Morton's tree and
are currently in mainline. It works especially well with the "buddy"
lockup detector.

[1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.g...@linaro.org/
[2] https://lore.kernel.org/lkml/zfvgqd%2f%2fpm%2flz...@fvff77s0q05n.cambridge.arm.com/
[3] https://lore.kernel.org/lkml/zndkvp2m-iizc...@fvff77s0q05n.cambridge.arm.com
[4] https://lore.kernel.org/r/20230419155341.v8.10.Ic3659997d6243139d0522fc3afcdfd88d7a5f030@changeid/

Changes in v12:
- ("arm64: smp: Mark IPI globals as __ro_after_init") new for v12.
- Added a comment about why we account for 16 SGIs when Linux uses 8.
- Minor comment change to add "()" after nmi_trigger_cpumask_backtrace.
- Updated the commit hash of the commit this depends on.

Changes in v11:
- Adjust comment about NR_IPI/MAX_IPI.
- Don't use confusing "backed by" idiom in comment.
- Made arm64_backtrace_ipi() static.
- Updated commit message as per Stephen.
- arch_send_wakeup_ipi() now takes an unsigned int.

Changes in v10:
- ("IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI") new for v10.
- ("arm64: smp: Remove dedicated wakeup IPI") new for v10.
- Backtrace now directly supported in smp.c
- Don't allocate the cpumask on the stack; just iterate.
- Moved kgdb calls to smp.c to avoid needing to export IPI info.
- Rewrite as needed for 5.11+ as per Mark Rutland and Sumit.
- Squash backtrace into patch adding support for pseudo-NMI IPIs.
- kgdb now has its own IPI.

Changes in v9:
- Added comments that we might not be using NMI always.
- Added to commit message that this doesn't catch all cases.
- Fold in v8 patch #10 ("Fallback to a regular IPI if NMI isn't enabled")
- Moved header file out of "include" since it didn't need to be there.
- Remove arm64_supports_nmi()
- Remove fallback for when debug IPI isn't available.
- Renamed "NMI IPI" to "debug IPI" since it might not be backed by NMI.
- arch_trigger_cpumask_backtrace() no longer returns bool

Changes in v8:
- "Tag the arm64 idle functions as __cpuidle" new for v8
- Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
- debug_ipi_setup() and debug_ipi_teardown() no longer take cpu param

Douglas Anderson (6):
  irqchip/gic-v3: Enable support for SGIs to act as NMIs
  arm64:
[Kgdb-bugreport] [PATCH v12 3/7] arm64: smp: Remove dedicated wakeup IPI
From: Mark Rutland

To enable NMI backtrace and KGDB's NMI cpu roundup, we need to free up
at least one dedicated IPI.

On arm64 the IPI_WAKEUP IPI is only used for the ACPI parking
protocol, which itself is only used on some very early ARMv8 systems
which couldn't implement PSCI.

Remove the IPI_WAKEUP IPI, and rely on the IPI_RESCHEDULE IPI to wake
CPUs from the parked state. This will cause a tiny amount of redundant
work to check the thread flags, but this is minuscule in relation to
the cost of taking and handling the IPI in the first place. We can
safely handle redundant IPI_RESCHEDULE IPIs, so there should be no
functional impact as a result of this change.

Signed-off-by: Mark Rutland
Reviewed-by: Stephen Boyd
Reviewed-by: Sumit Garg
Signed-off-by: Douglas Anderson
Cc: Catalin Marinas
Cc: Marc Zyngier
Cc: Will Deacon
---
I have no idea how to test this. I just took Mark's patch and jammed
it into my series. Logically the patch seems reasonable to me.

(no changes since v11)

Changes in v11:
- arch_send_wakeup_ipi() now takes an unsigned int.

Changes in v10:
- ("arm64: smp: Remove dedicated wakeup IPI") new for v10.

arch/arm64/include/asm/smp.h | 4 ++-- arch/arm64/kernel/acpi_parking_protocol.c | 2 +- arch/arm64/kernel/smp.c | 28 +-- 3 files changed, 14 insertions(+), 20 deletions(-) diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h index 9b31e6d0da17..efb13112b408 100644 --- a/arch/arm64/include/asm/smp.h +++ b/arch/arm64/include/asm/smp.h @@ -89,9 +89,9 @@ extern void arch_send_call_function_single_ipi(int cpu); extern void arch_send_call_function_ipi_mask(const struct cpumask *mask); #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL -extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask); +extern void arch_send_wakeup_ipi(unsigned int cpu); #else -static inline void arch_send_wakeup_ipi_mask(const struct cpumask *mask) +static inline void arch_send_wakeup_ipi(unsigned int cpu) { BUILD_BUG(); } diff --git a/arch/arm64/kernel/acpi_parking_protocol.c b/arch/arm64/kernel/acpi_parking_protocol.c index b1990e38aed0..e1be29e608b7 100644 --- a/arch/arm64/kernel/acpi_parking_protocol.c +++ b/arch/arm64/kernel/acpi_parking_protocol.c @@ -103,7 +103,7 @@ static int acpi_parking_protocol_cpu_boot(unsigned int cpu) >entry_point); writel_relaxed(cpu_entry->gic_cpu_id, >cpu_id); - arch_send_wakeup_ipi_mask(cpumask_of(cpu)); + arch_send_wakeup_ipi(cpu); return 0; } diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 960b98b43506..a5848f1ef817 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -72,7 +72,6 @@ enum ipi_msg_type { IPI_CPU_CRASH_STOP, IPI_TIMER, IPI_IRQ_WORK, - IPI_WAKEUP, NR_IPI }; @@ -764,7 +763,6 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = { [IPI_CPU_CRASH_STOP]= "CPU stop (for crash dump) interrupts", [IPI_TIMER] = "Timer broadcast interrupts", [IPI_IRQ_WORK] = "IRQ work interrupts", - [IPI_WAKEUP]= "CPU wake-up interrupts", }; static void smp_cross_call(const struct cpumask *target, unsigned int ipinr); @@ -797,13 +795,6 @@ void arch_send_call_function_single_ipi(int cpu) 
smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC); } -#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL -void arch_send_wakeup_ipi_mask(const struct cpumask *mask) -{ - smp_cross_call(mask, IPI_WAKEUP); -} -#endif - #ifdef CONFIG_IRQ_WORK void arch_irq_work_raise(void) { @@ -897,14 +888,6 @@ static void do_handle_IPI(int ipinr) break; #endif -#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL - case IPI_WAKEUP: - WARN_ONCE(!acpi_parking_protocol_valid(cpu), - "CPU%u: Wake-up IPI outside the ACPI parking protocol\n", - cpu); - break; -#endif - default: pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr); break; @@ -979,6 +962,17 @@ void arch_smp_send_reschedule(int cpu) smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE); } +#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL +void arch_send_wakeup_ipi(unsigned int cpu) +{ + /* +* We use a scheduler IPI to wake the CPU as this avoids the need for a +* dedicated IPI and we can safely handle spurious scheduler IPIs. +*/ + arch_smp_send_reschedule(cpu); +} +#endif + #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST void tick_broadcast(const struct cpumask *mask) { -- 2.42.0.283.g2d96d420d3-goog ___ Kgdb-bugreport mailing list Kgdb-bugreport@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport
[Kgdb-bugreport] [PATCH v12 1/7] irqchip/gic-v3: Enable support for SGIs to act as NMIs
As of commit 6abbd6988971 ("irqchip/gic, gic-v3: Make SGIs use
handle_percpu_devid_irq()") SGIs are treated the same as PPIs/EPPIs
and use handle_percpu_devid_irq() by default. Unfortunately,
handle_percpu_devid_irq() isn't NMI safe, and so to run in an NMI
context those should use handle_percpu_devid_fasteoi_nmi().

In order to accomplish this, we just have to make room for SGIs in the
array of refcounts that keeps track of which interrupts are set as
NMI. We also rename the array and create a new indexing scheme that
accounts for SGIs.

Also, enable NMI support prior to gic_smp_init() as allocation of SGIs
as IRQs/NMIs happens as part of this routine.

Co-developed-by: Sumit Garg
Signed-off-by: Sumit Garg
Signed-off-by: Douglas Anderson
---
I'll note that this change is a little more black magic to me than
others in this series. I don't have a massive amount of familiarity
with all the moving parts of gic-v3, so I mostly just followed Mark
Rutland's advice [1]. Please pay extra attention to make sure I didn't
do anything too terrible.

Mark's advice wasn't a full patch and I ended up doing a bit of work
to translate it to reality, so I did not add him as "Co-developed-by"
here. Mark: if you would like this tag then please provide it and your
Signed-off-by. I certainly won't object.

[1] https://lore.kernel.org/r/znc-yrqopo0pa...@fvff77s0q05n.cambridge.arm.com

Changes in v12:
- Added a comment about why we account for 16 SGIs when Linux uses 8.

Changes in v10:
- Rewrite as needed for 5.11+ as per Mark Rutland and Sumit.

drivers/irqchip/irq-gic-v3.c | 59 +--- 1 file changed, 41 insertions(+), 18 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index eedfa8e9f077..8d20122ba0a8 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -78,6 +78,13 @@ static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key); #define GIC_LINE_NRmin(GICD_TYPER_SPIS(gic_data.rdists.gicd_typer), 1020U) #define GIC_ESPI_NRGICD_TYPER_ESPIS(gic_data.rdists.gicd_typer) +/* + * There are 16 SGIs, though we only actually use 8 in Linux. The other 8 SGIs + * are potentially stolen by the secure side. Some code, especially code dealing + * with hwirq IDs, is simplified by accounting for all 16. + */ +#define SGI_NR 16 + /* * The behaviours of RPR and PMR registers differ depending on the value of * SCR_EL3.FIQ, and the behaviour of non-secure priority registers of the @@ -125,8 +132,8 @@ EXPORT_SYMBOL(gic_nonsecure_priorities); __priority; \ }) -/* ppi_nmi_refs[n] == number of cpus having ppi[n + 16] set as NMI */ -static refcount_t *ppi_nmi_refs; +/* rdist_nmi_refs[n] == number of cpus having the rdist interrupt n set as NMI */ +static refcount_t *rdist_nmi_refs; static struct gic_kvm_info gic_v3_kvm_info __initdata; static DEFINE_PER_CPU(bool, has_rss); @@ -519,9 +526,22 @@ static u32 __gic_get_ppi_index(irq_hw_number_t hwirq) } } -static u32 gic_get_ppi_index(struct irq_data *d) +static u32 __gic_get_rdist_idx(irq_hw_number_t hwirq) +{ + switch (__get_intid_range(hwirq)) { + case SGI_RANGE: + case PPI_RANGE: + return hwirq; + case EPPI_RANGE: + return hwirq - EPPI_BASE_INTID + 32; + default: + unreachable(); + } +} + +static u32 gic_get_rdist_idx(struct irq_data *d) { - return __gic_get_ppi_index(d->hwirq); + return __gic_get_rdist_idx(d->hwirq); } static int gic_irq_nmi_setup(struct irq_data *d) @@ -545,11 +565,14 @@ static int gic_irq_nmi_setup(struct irq_data *d) /* desc lock should already be held */ if (gic_irq_in_rdist(d)) { - u32 idx = 
gic_get_ppi_index(d); + u32 idx = gic_get_rdist_idx(d); - /* Setting up PPI as NMI, only switch handler for first NMI */ - if (!refcount_inc_not_zero(_nmi_refs[idx])) { - refcount_set(_nmi_refs[idx], 1); + /* +* Setting up a percpu interrupt as NMI, only switch handler +* for first NMI +*/ + if (!refcount_inc_not_zero(_nmi_refs[idx])) { + refcount_set(_nmi_refs[idx], 1); desc->handle_irq = handle_percpu_devid_fasteoi_nmi; } } else { @@ -582,10 +605,10 @@ static void gic_irq_nmi_teardown(struct irq_data *d) /* desc lock should already be held */ if (gic_irq_in_rdist(d)) { - u32 idx = gic_get_ppi_index(d); + u32 idx = gic_get_rdist_idx(d); /* Tearing down NMI, only switch handler for last NMI */ - if (refcount_dec_and_test(_nmi_refs[idx])) + if (refcount_dec_and_test(_nmi_refs[idx])) desc->handle_irq = handle_percpu_devid_irq; } else {
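The new indexing scheme and the first/last refcount rule from this patch can be sketched in user space as below. Treat it as an illustration under assumptions rather than the driver code: the value 1056 for EPPI_BASE_INTID and the count of 64 EPPIs are taken from my reading of the GICv3 INTID layout, `model_rdist_idx()` returns -1 where the driver would hit unreachable(), and `struct model_irq` with a plain counter is a hypothetical stand-in for the refcount_t array and desc->handle_irq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SGI_NR		16	/* SGIs occupy INTIDs 0-15 */
#define EPPI_BASE_INTID	1056	/* assumed, per the GICv3 INTID layout */
#define EPPI_NR		64	/* assumed number of extended PPIs */

/*
 * Model of __gic_get_rdist_idx(): SGIs and PPIs index directly by hwirq
 * (0-31); EPPIs are packed in right after them, starting at 32.
 * Returns -1 for interrupts that do not live in the redistributor.
 */
int model_rdist_idx(uint32_t hwirq)
{
	if (hwirq < 32)
		return (int)hwirq;
	if (hwirq >= EPPI_BASE_INTID && hwirq < EPPI_BASE_INTID + EPPI_NR)
		return (int)(hwirq - EPPI_BASE_INTID + 32);
	return -1;
}

/* Hypothetical per-interrupt state: NMI user count and current handler. */
struct model_irq {
	unsigned int nmi_refs;
	bool nmi_handler;	/* true once the NMI flow handler is installed */
};

/* Only the first NMI user switches the handler. */
void model_nmi_setup(struct model_irq *irq)
{
	if (irq->nmi_refs++ == 0)
		irq->nmi_handler = true;
}

/* Only the last NMI user switches the handler back. */
void model_nmi_teardown(struct model_irq *irq)
{
	assert(irq->nmi_refs > 0);
	if (--irq->nmi_refs == 0)
		irq->nmi_handler = false;
}
```

Indexing SGIs by their raw hwirq is what the SGI_NR comment in the patch is getting at: accounting for all 16 keeps the PPI offsets unchanged at 16-31.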
Re: [Kgdb-bugreport] [PATCH] kgdb: Flush console before entering kgdb on panic
On Fri, Aug 25, 2023 at 07:18:44AM -0700, Doug Anderson wrote:
> Hi,
>
> On Fri, Aug 25, 2023 at 3:09 AM Daniel Thompson wrote:
> >
> > On Tue, Aug 22, 2023 at 01:19:46PM -0700, Douglas Anderson wrote:
> > > When entering kdb/kgdb on a kernel panic, it was observed that the
> > > console isn't flushed before the `kdb` prompt came up. Specifically,
> > > when using the buddy lockup detector on arm64 and running:
> > >   echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
> > >
> > > I could see:
> > >   [ 26.161099] lkdtm: Performing direct entry HARDLOCKUP
> > >   [ 32.499881] watchdog: Watchdog detected hard LOCKUP on cpu 6
> > >   [ 32.552865] Sending NMI from CPU 5 to CPUs 6:
> > >   [ 32.557359] NMI backtrace for cpu 6
> > >   ... [backtrace for cpu 6] ...
> > >   [ 32.558353] NMI backtrace for cpu 5
> > >   ... [backtrace for cpu 5] ...
> > >   [ 32.867471] Sending NMI from CPU 5 to CPUs 0-4,7:
> > >   [ 32.872321] NMI backtrace forP cpuANC: Hard LOCKUP
> > >
> > >   Entering kdb (current=..., pid 0) on processor 5 due to Keyboard Entry
> > >   [5]kdb>
> > >
> > > As you can see, backtraces for the other CPUs start printing and get
> > > interleaved with the kdb PANIC print.
> > >
> > > Let's replicate the commands to flush the console in the kdb panic
> > > entry point to avoid this.
> > >
> > > Signed-off-by: Douglas Anderson
> > > ---
> > >
> > >  kernel/debug/debug_core.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > > index d5e9ccde3ab8..3a904d8697c8 100644
> > > --- a/kernel/debug/debug_core.c
> > > +++ b/kernel/debug/debug_core.c
> > > @@ -1006,6 +1006,9 @@ void kgdb_panic(const char *msg)
> > >       if (panic_timeout)
> > >               return;
> > >
> > > +     debug_locks_off();
> > > +     console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> > > +
> > >       if (dbg_kdb_mode)
> > >               kdb_printf("PANIC: %s\n", msg);
> >
> > I'm somewhat inclined to say *this* (calling kdb_printf() when not
> > actually in the debugger) is the cause of the problem. kdb_printf()
> > does some pretty horrid things to the console and isn't intended to
> > run while the system is active.
> >
> > I'd therefore be more tempted to defer the print to the b.p. trap
> > handler itself and make this part of kgdb_panic() look more like:
> >
> >     kgdb_panic_msg = msg;
> >     kgdb_breakpoint();
> >     kgdb_panic_msg = NULL;
>
> Unfortunately I think that only solves half the problem. As a quick
> test, I tried simply commenting out the "kdb_printf" line in
> kgdb_panic(). While that avoids the interleaved panic message and
> backtrace, it does nothing to actually get the backtraces printed out
> before you end up in kdb. As an example, this is what happened when I
> used `echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT` and
> had the "kdb_printf" in kgdb_panic() commented out:
>
> [ 72.658424] lkdtm: Performing direct entry HARDLOCKUP
> [ 82.181857] watchdog: Watchdog detected hard LOCKUP on cpu 6
> ...
> [ 82.234801] Sending NMI from CPU 5 to CPUs 6:
> [ 82.239296] NMI backtrace for cpu 6
> ... [ stack trace for CPU 6 ] ...
> [ 82.240294] NMI backtrace for cpu 5
> ... [ stack trace for CPU 5 ] ...
> [ 82.576443] Sending NMI from CPU 5 to CPUs 0-4,7:
> [ 82.581291] NMI backtrace
> Entering kdb (current=0xff80da5a1080, pid 6978) on processor 5 due
> to Keyboard Entry
> [5]kdb>
>
> As you can see, I don't see the traces for CPUs 0-4 and 7. Those do
> show up if I use the "dmesg" command but it's a bit of a hassle to run
> "dmesg" to look for any extra debug messages every time I drop in kdb.
>
> I guess perhaps that part isn't obvious from the commit message?

I figured it was a risk. In fact it's an area where my instinct to
honour console messages and my instinct to get into the kernel as soon
as possible after the decision to invoke it has been made come into
conflict. In other words, does it matter that the console buffers are
not flushed before entering kgdb?

However, having thought about it for a little while (and knowing the
console code tends to be written to be decently robust) I have come to
the view that the flushing is best.

> Should I send a new version with an updated commit message indicating
> that it's not just the jumbled text that's a problem but also the lack
> of stack traces?

No real need.

I don't really like seeing kdb_printf() being called from here but
having reviewed a bit of console code I think we might be able to use
the new infrastructure to make kdb_printf() slightly less hateful ;-).


Daniel.