[Kgdb-bugreport] [PATCH v12 5/7] arm64: smp: IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI

2023-08-30 Thread Douglas Anderson
There's no reason why IPI_CPU_STOP and IPI_CPU_CRASH_STOP can't be
handled as NMI. They are very simple and everything in them is
NMI-safe. Mark them as things to use NMI for if NMI is available.
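
For context, a rough sketch of how this flag ends up being consumed by
ipi_setup() (paraphrased from the pseudo-NMI support added earlier in this
series; simplified, error handling omitted):

  static void ipi_setup(int cpu)
  {
          int i;

          for (i = 0; i < nr_ipi; i++) {
                  if (ipi_should_be_nmi(i)) {
                          /* Request/enable the SGI as a per-CPU NMI... */
                          prepare_percpu_nmi(ipi_irq_base + i);
                          enable_percpu_nmi(ipi_irq_base + i, 0);
                  } else {
                          /* ...or fall back to a regular per-CPU IRQ. */
                          enable_percpu_irq(ipi_irq_base + i, 0);
                  }
          }
  }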

Suggested-by: Mark Rutland 
Reviewed-by: Stephen Boyd 
Reviewed-by: Misono Tomohiro 
Reviewed-by: Sumit Garg 
Signed-off-by: Douglas Anderson 
---
I don't actually have any good way to test/validate this patch. It's
added to the series at Mark's request.

(no changes since v10)

Changes in v10:
- ("IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI") new for v10.

 arch/arm64/kernel/smp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 28c904ca499a..800c59cf9b64 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -946,6 +946,8 @@ static bool ipi_should_be_nmi(enum ipi_msg_type ipi)
return false;
 
switch (ipi) {
+   case IPI_CPU_STOP:
+   case IPI_CPU_CRASH_STOP:
case IPI_CPU_BACKTRACE:
return true;
default:
-- 
2.42.0.283.g2d96d420d3-goog





[Kgdb-bugreport] [PATCH v12 7/7] arm64: smp: Mark IPI globals as __ro_after_init

2023-08-30 Thread Douglas Anderson
Mark the three IPI-related globals in smp.c as "__ro_after_init" since
they are only ever set in set_smp_ipi_range(), which is marked
"__init". This is a better and more secure marking than the old
"__read_mostly".

Suggested-by: Stephen Boyd 
Signed-off-by: Douglas Anderson 
---
This patch is almost completely unrelated to the rest of the series
other than the fact that it would cause a merge conflict with the
series if sent separately. I tacked it on to this series in response
to Stephen's feedback on v11 of this series [1]. If someone hates it
(not sure why they would), it could be dropped. If someone loves it,
it could be promoted to the start of the series and/or land on its own
(resolving merge conflicts).

[1] 
https://lore.kernel.org/r/cae-0n52ivdgza8xt8ktmj12c_essjt7f7a0fuz_oammqpgc...@mail.gmail.com

Changes in v12:
- ("arm64: smp: Mark IPI globals as __ro_after_init") new for v12.

 arch/arm64/kernel/smp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 1a53e57c81d0..814d9aa93b21 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -84,9 +84,9 @@ enum ipi_msg_type {
MAX_IPI
 };
 
-static int ipi_irq_base __read_mostly;
-static int nr_ipi __read_mostly = NR_IPI;
-static struct irq_desc *ipi_desc[MAX_IPI] __read_mostly;
+static int ipi_irq_base __ro_after_init;
+static int nr_ipi __ro_after_init = NR_IPI;
+static struct irq_desc *ipi_desc[MAX_IPI] __ro_after_init;
 
 static void ipi_setup(int cpu);
 
-- 
2.42.0.283.g2d96d420d3-goog





[Kgdb-bugreport] [PATCH v12 4/7] arm64: smp: Add arch support for backtrace using pseudo-NMI

2023-08-30 Thread Douglas Anderson
Enable arch_trigger_cpumask_backtrace() support on arm64. This enables
things much like they are enabled on arm32 (including some of the
funky logic around NR_IPI, nr_ipi, and MAX_IPI) but with the
difference that, unlike arm32, we'll try to enable the backtrace to
use pseudo-NMI.

NOTE: this patch is a squash of the little bit of code adding the
ability to mark an IPI to try to use pseudo-NMI plus the little bit of
code to hook things up for kgdb. This approach was decided upon in the
discussion of v9 [1].

This patch depends on commit 8d539b84f1e3 ("nmi_backtrace: allow
excluding an arbitrary CPU") since that commit changed the prototype
of arch_trigger_cpumask_backtrace(), which this patch implements.

[1] https://lore.kernel.org/r/ZORY51mF4alI41G1@FVFF77S0Q05N
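
A hedged sketch of how a caller reaches this new hook (the helper name is
from <linux/nmi.h> as changed by the commit mentioned above; the
surrounding function is hypothetical):

  #include <linux/nmi.h>
  #include <linux/smp.h>

  static void dump_all_other_cpus(void)
  {
          /*
           * Ends up in arch_trigger_cpumask_backtrace(); on arm64 with
           * this patch the backtrace IPI is a pseudo-NMI when available
           * and a regular IPI otherwise.
           */
          trigger_allbutcpu_cpu_backtrace(raw_smp_processor_id());
  }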

Co-developed-by: Sumit Garg 
Signed-off-by: Sumit Garg 
Co-developed-by: Mark Rutland 
Signed-off-by: Mark Rutland 
Reviewed-by: Stephen Boyd 
Reviewed-by: Misono Tomohiro 
Signed-off-by: Douglas Anderson 
---

Changes in v12:
- Minor comment change to add "()" after nmi_trigger_cpumask_backtrace.
- Updated the commit hash of the commit this depends on.

Changes in v11:
- Adjust comment about NR_IPI/MAX_IPI.
- Don't use confusing "backed by" idiom in comment.
- Made arm64_backtrace_ipi() static.

Changes in v10:
- Backtrace now directly supported in smp.c
- Squash backtrace into patch adding support for pseudo-NMI IPIs.

Changes in v9:
- Added comments that we might not be using NMI always.
- Fold in v8 patch #10 ("Fallback to a regular IPI if NMI isn't enabled")
- Moved header file out of "include" since it didn't need to be there.
- Remove arm64_supports_nmi()
- Renamed "NMI IPI" to "debug IPI" since it might not be backed by NMI.
- arch_trigger_cpumask_backtrace() no longer returns bool

Changes in v8:
- Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
- debug_ipi_setup() and debug_ipi_teardown() no longer take cpu param

 arch/arm64/include/asm/irq.h |  3 ++
 arch/arm64/kernel/smp.c  | 86 +++-
 2 files changed, 78 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index fac08e18bcd5..50ce8b697ff3 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -6,6 +6,9 @@
 
 #include 
 
+void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu);
+#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
+
 struct pt_regs;
 
 int set_handle_irq(void (*handle_irq)(struct pt_regs *));
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index a5848f1ef817..28c904ca499a 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -72,12 +73,18 @@ enum ipi_msg_type {
IPI_CPU_CRASH_STOP,
IPI_TIMER,
IPI_IRQ_WORK,
-   NR_IPI
+   NR_IPI,
+   /*
+* Any enum >= NR_IPI and < MAX_IPI is special and not tracable
+* with trace_ipi_*
+*/
+   IPI_CPU_BACKTRACE = NR_IPI,
+   MAX_IPI
 };
 
 static int ipi_irq_base __read_mostly;
 static int nr_ipi __read_mostly = NR_IPI;
-static struct irq_desc *ipi_desc[NR_IPI] __read_mostly;
+static struct irq_desc *ipi_desc[MAX_IPI] __read_mostly;
 
 static void ipi_setup(int cpu);
 
@@ -845,6 +852,22 @@ static void __noreturn ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs
 #endif
 }
 
+static void arm64_backtrace_ipi(cpumask_t *mask)
+{
+   __ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
+}
+
+void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
+{
+   /*
+* NOTE: though nmi_trigger_cpumask_backtrace() has "nmi_" in the name,
+* nothing about it truly needs to be implemented using an NMI, it's
+* just that it's _allowed_ to work with NMIs. If ipi_should_be_nmi()
+* returned false our backtrace attempt will just use a regular IPI.
+*/
+   nmi_trigger_cpumask_backtrace(mask, exclude_cpu, arm64_backtrace_ipi);
+}
+
 /*
  * Main handler for inter-processor interrupts
  */
@@ -888,6 +911,14 @@ static void do_handle_IPI(int ipinr)
break;
 #endif
 
+   case IPI_CPU_BACKTRACE:
+   /*
+* NOTE: in some cases this _won't_ be NMI context. See the
+* comment in arch_trigger_cpumask_backtrace().
+*/
+   nmi_cpu_backtrace(get_irq_regs());
+   break;
+
default:
pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr);
break;
@@ -909,6 +940,19 @@ static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
__ipi_send_mask(ipi_desc[ipinr], target);
 }
 
+static bool ipi_should_be_nmi(enum ipi_msg_type ipi)
+{
+   if (!system_uses_irq_prio_masking())
+   return false;
+
+   switch (ipi) {
+   case IPI_CPU_BACKTRACE:
+   return true;
+   

[Kgdb-bugreport] [PATCH v12 6/7] arm64: kgdb: Implement kgdb_roundup_cpus() to enable pseudo-NMI roundup

2023-08-30 Thread Douglas Anderson
Up until now we've been using the generic (weak) implementation for
kgdb_roundup_cpus() when using kgdb on arm64. Let's move to a custom
one. The advantage here is that, when pseudo-NMI is enabled on a
device, we'll be able to round up CPUs using pseudo-NMI. This allows
us to debug CPUs that are stuck with interrupts disabled. If
pseudo-NMIs are not enabled then we'll fall back to just using an IPI,
which is still slightly better than the generic implementation since
it avoids the potential situation described in the generic
kgdb_call_nmi_hook().
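
Roughly, the contract this relies on (a simplified paraphrase of
kgdb_nmicallback() in kernel/debug/debug_core.c, not the real code; the
helper names below are made up):

  int kgdb_nmicallback(int cpu, void *regs)
  {
          /* Park this CPU only if a debugger session is already active. */
          if (debugger_session_active())
                  return enter_debugger_as_slave(cpu, regs);  /* returns 0 */

          /* No session in progress: tell the caller nothing happened. */
          return 1;
  }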

Co-developed-by: Sumit Garg 
Signed-off-by: Sumit Garg 
Reviewed-by: Daniel Thompson 
Reviewed-by: Stephen Boyd 
Signed-off-by: Douglas Anderson 
---
I debated whether this should be in "arch/arm64/kernel/smp.c" or if I
should try to find a way for it to go into "arch/arm64/kernel/kgdb.c".
In the end this is so little code that it didn't seem worth it to find
a way to export the IPI defines or to otherwise come up with some API
between kgdb.c and smp.c. If someone has strong feelings and wants
this to change, please shout and give details of your preferred
solution.

FWIW, it seems like ~half the other platforms put this in "smp.c" with
an ifdef for KGDB and the other half put it in "kgdb.c" with an ifdef
for SMP. :-P

(no changes since v10)

Changes in v10:
- Don't allocate the cpumask on the stack; just iterate.
- Moved kgdb calls to smp.c to avoid needing to export IPI info.
- kgdb now has its own IPI.

Changes in v9:
- Remove fallback for when debug IPI isn't available.
- Renamed "NMI IPI" to "debug IPI" since it might not be backed by NMI.

 arch/arm64/kernel/smp.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 800c59cf9b64..1a53e57c81d0 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -79,6 +80,7 @@ enum ipi_msg_type {
 * with trace_ipi_*
 */
IPI_CPU_BACKTRACE = NR_IPI,
+   IPI_KGDB_ROUNDUP,
MAX_IPI
 };
 
@@ -868,6 +870,22 @@ void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
nmi_trigger_cpumask_backtrace(mask, exclude_cpu, arm64_backtrace_ipi);
 }
 
+#ifdef CONFIG_KGDB
+void kgdb_roundup_cpus(void)
+{
+   int this_cpu = raw_smp_processor_id();
+   int cpu;
+
+   for_each_online_cpu(cpu) {
+   /* No need to roundup ourselves */
+   if (cpu == this_cpu)
+   continue;
+
+   __ipi_send_single(ipi_desc[IPI_KGDB_ROUNDUP], cpu);
+   }
+}
+#endif
+
 /*
  * Main handler for inter-processor interrupts
  */
@@ -919,6 +937,10 @@ static void do_handle_IPI(int ipinr)
nmi_cpu_backtrace(get_irq_regs());
break;
 
+   case IPI_KGDB_ROUNDUP:
+   kgdb_nmicallback(cpu, get_irq_regs());
+   break;
+
default:
pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr);
break;
@@ -949,6 +971,7 @@ static bool ipi_should_be_nmi(enum ipi_msg_type ipi)
case IPI_CPU_STOP:
case IPI_CPU_CRASH_STOP:
case IPI_CPU_BACKTRACE:
+   case IPI_KGDB_ROUNDUP:
return true;
default:
return false;
-- 
2.42.0.283.g2d96d420d3-goog





[Kgdb-bugreport] [PATCH v12 2/7] arm64: idle: Tag the arm64 idle functions as __cpuidle

2023-08-30 Thread Douglas Anderson
As per the (somewhat recent) comment before the definition of
`__cpuidle`, the tag is like `noinstr` but also marks a function so it
can be identified by cpu_in_idle(). Let's add these markings to arm64
cpuidle functions.
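
Roughly how the generic helper uses the tag (paraphrased from
kernel/sched/idle.c): __cpuidle places a function in the .cpuidle.text
section, and cpu_in_idle() is little more than a bounds check against
that section.

  bool cpu_in_idle(unsigned long pc)
  {
          return pc >= (unsigned long)__cpuidle_text_start &&
                 pc < (unsigned long)__cpuidle_text_end;
  }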

With this change we get useful backtraces like:

  NMI backtrace for cpu N skipped: idling at cpu_do_idle+0x94/0x98

instead of useless backtraces when dumping all processors using
nmi_cpu_backtrace().

NOTE: this patch won't make cpu_in_idle() work perfectly for arm64,
but it doesn't hurt and does catch some cases. Specifically an example
that wasn't caught in my testing looked like this:

 gic_cpu_sys_reg_init+0x1f8/0x314
 gic_cpu_pm_notifier+0x40/0x78
 raw_notifier_call_chain+0x5c/0x134
 cpu_pm_notify+0x38/0x64
 cpu_pm_exit+0x20/0x2c
 psci_enter_idle_state+0x48/0x70
 cpuidle_enter_state+0xb8/0x260
 cpuidle_enter+0x44/0x5c
 do_idle+0x188/0x30c

Acked-by: Mark Rutland 
Reviewed-by: Stephen Boyd 
Acked-by: Sumit Garg 
Signed-off-by: Douglas Anderson 
---

(no changes since v11)

Changes in v11:
- Updated commit message as per Stephen.

Changes in v9:
- Added to commit message that this doesn't catch all cases.

Changes in v8:
- "Tag the arm64 idle functions as __cpuidle" new for v8

 arch/arm64/kernel/idle.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
index c1125753fe9b..05cfb347ec26 100644
--- a/arch/arm64/kernel/idle.c
+++ b/arch/arm64/kernel/idle.c
@@ -20,7 +20,7 @@
  * ensure that interrupts are not masked at the PMR (because the core will
  * not wake up if we block the wake up signal in the interrupt controller).
  */
-void noinstr cpu_do_idle(void)
+void __cpuidle cpu_do_idle(void)
 {
struct arm_cpuidle_irq_context context;
 
@@ -35,7 +35,7 @@ void noinstr cpu_do_idle(void)
 /*
  * This is our default idle handler.
  */
-void noinstr arch_cpu_idle(void)
+void __cpuidle arch_cpu_idle(void)
 {
/*
 * This should do all the clock switching and wait for interrupt
-- 
2.42.0.283.g2d96d420d3-goog





[Kgdb-bugreport] [PATCH v12 0/7] arm64: Add IPI for backtraces / kgdb; try to use NMI for some IPIs

2023-08-30 Thread Douglas Anderson
This is an attempt to resurrect Sumit's old patch series [1] that
allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
also to round up CPUs in kdb/kgdb. The last post from Sumit that I
could find was v7, so I started my series at v8. I haven't copied all
of his old changelogs here, but you can find them from the link.

This patch series targets v6.6. Specifically it can't land in v6.5
since it depends on commit 8d539b84f1e3 ("nmi_backtrace: allow
excluding an arbitrary CPU").

It should be noted that Mark still feels there might be some corner
cases where pseudo-NMI is not production ready [2] [3], but as far as
I'm aware there are no concrete/documented issues. Regardless of
whether this should be enabled for production, though, this series
will be invaluable to anyone trying to debug crashes on arm64
machines.

v12 of this series collects tags, fixes a few small nits in comments
and commit messages from v11 and adds a new (and somewhat unrelated)
small patch to the end of the series. There are no code changes other
than the last patch, which is tiny.

v11 of this series addressed Stephen Boyd's feedback on v10 and added
a missing "static" that the patches robot found.

v10 of this series attempted to address all of Mark's feedback on
v9. As a quick summary:
- It includes his patch to remove IPI_WAKEUP, freeing up an extra IPI.
- It no longer combines the "kgdb" and "backtrace" IPIs. If we need
  another IPI these could always be recombined later.
- It promotes IPI_CPU_STOP and IPI_CPU_CRASH_STOP to NMI.
- It puts nearly all the code directly in smp.c.
- Several of the patches are squashed together.
- Patch #6 ("kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB")
  was dropped from the series since it landed.

Between v8 and v9, I had cleaned up this patch series by integrating
the 10th patch from v8 [4] into the whole series. As part of this, I
renamed the "NMI IPI" to the "debug IPI" since it could now be backed
by a regular IPI in the case that pseudo NMIs weren't available. With
the fallback, this allowed me to drop some extra patches from the
series. This feels (to me) to be pretty clean and hopefully others
agree. Any patch I touched significantly I removed Masayoshi and
Chen-Yu's tags from.

...also in v8, I reordered the patches a bit in a way that seemed a
little cleaner to me.

Since v7, I have:
* Addressed the small amount of feedback that was there for v7.
* Rebased.
* Added a new patch that prevents us from spamming the logs with idle
  tasks.
* Added an extra patch to gracefully fall back to regular IPIs if
  pseudo-NMIs aren't there.

It can be noted that this patch series works very well with the recent
"hardlockup" patches that have landed through Andrew Morton's tree and
are currently in mainline. It works especially well with the "buddy"
lockup detector.

[1] 
https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.g...@linaro.org/
[2] 
https://lore.kernel.org/lkml/zfvgqd%2f%2fpm%2flz...@fvff77s0q05n.cambridge.arm.com/
[3] https://lore.kernel.org/lkml/zndkvp2m-iizc...@fvff77s0q05n.cambridge.arm.com
[4] 
https://lore.kernel.org/r/20230419155341.v8.10.Ic3659997d6243139d0522fc3afcdfd88d7a5f030@changeid/

Changes in v12:
- ("arm64: smp: Mark IPI globals as __ro_after_init") new for v12.
- Added a comment about why we account for 16 SGIs when Linux uses 8.
- Minor comment change to add "()" after nmi_trigger_cpumask_backtrace.
- Updated the commit hash of the commit this depends on.

Changes in v11:
- Adjust comment about NR_IPI/MAX_IPI.
- Don't use confusing "backed by" idiom in comment.
- Made arm64_backtrace_ipi() static.
- Updated commit message as per Stephen.
- arch_send_wakeup_ipi() now takes an unsigned int.

Changes in v10:
- ("IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI") new for v10.
- ("arm64: smp: Remove dedicated wakeup IPI") new for v10.
- Backtrace now directly supported in smp.c
- Don't allocate the cpumask on the stack; just iterate.
- Moved kgdb calls to smp.c to avoid needing to export IPI info.
- Rewrite as needed for 5.11+ as per Mark Rutland and Sumit.
- Squash backtrace into patch adding support for pseudo-NMI IPIs.
- kgdb now has its own IPI.

Changes in v9:
- Added comments that we might not be using NMI always.
- Added to commit message that this doesn't catch all cases.
- Fold in v8 patch #10 ("Fallback to a regular IPI if NMI isn't enabled")
- Moved header file out of "include" since it didn't need to be there.
- Remove arm64_supports_nmi()
- Remove fallback for when debug IPI isn't available.
- Renamed "NMI IPI" to "debug IPI" since it might not be backed by NMI.
- arch_trigger_cpumask_backtrace() no longer returns bool

Changes in v8:
- "Tag the arm64 idle functions as __cpuidle" new for v8
- Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
- debug_ipi_setup() and debug_ipi_teardown() no longer take cpu param

Douglas Anderson (6):
  irqchip/gic-v3: Enable support for SGIs to act as NMIs
  arm64: 

[Kgdb-bugreport] [PATCH v12 3/7] arm64: smp: Remove dedicated wakeup IPI

2023-08-30 Thread Douglas Anderson
From: Mark Rutland 

To enable NMI backtrace and KGDB's NMI cpu roundup, we need to free up
at least one dedicated IPI.

On arm64 the IPI_WAKEUP IPI is only used for the ACPI parking protocol,
which itself is only used on some very early ARMv8 systems which
couldn't implement PSCI.

Remove the IPI_WAKEUP IPI, and rely on the IPI_RESCHEDULE IPI to wake
CPUs from the parked state. This will cause a tiny amount of redundant
work to check the thread flags, but this is minuscule in relation to the
cost of taking and handling the IPI in the first place. We can safely
handle redundant IPI_RESCHEDULE IPIs, so there should be no functional
impact as a result of this change.
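
For reference, the receive side that makes a redundant wakeup harmless is
tiny (paraphrased from the scheduler_ipi() definition in current kernels;
simplified):

  static __always_inline void scheduler_ipi(void)
  {
          /*
           * Fold TIF_NEED_RESCHED into the preempt count; a spurious
           * reschedule IPI does nothing beyond the wakeup itself.
           */
          preempt_fold_need_resched();
  }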

Signed-off-by: Mark Rutland 
Reviewed-by: Stephen Boyd 
Reviewed-by: Sumit Garg 
Signed-off-by: Douglas Anderson 
Cc: Catalin Marinas 
Cc: Marc Zyngier 
Cc: Will Deacon 
---
I have no idea how to test this. I just took Mark's patch and jammed
it into my series. Logicially the patch seems reasonable to me.

(no changes since v11)

Changes in v11:
- arch_send_wakeup_ipi() now takes an unsigned int.

Changes in v10:
- ("arm64: smp: Remove dedicated wakeup IPI") new for v10.

 arch/arm64/include/asm/smp.h  |  4 ++--
 arch/arm64/kernel/acpi_parking_protocol.c |  2 +-
 arch/arm64/kernel/smp.c   | 28 +--
 3 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 9b31e6d0da17..efb13112b408 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -89,9 +89,9 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 
 #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
-extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
+extern void arch_send_wakeup_ipi(unsigned int cpu);
 #else
-static inline void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
+static inline void arch_send_wakeup_ipi(unsigned int cpu)
 {
BUILD_BUG();
 }
diff --git a/arch/arm64/kernel/acpi_parking_protocol.c b/arch/arm64/kernel/acpi_parking_protocol.c
index b1990e38aed0..e1be29e608b7 100644
--- a/arch/arm64/kernel/acpi_parking_protocol.c
+++ b/arch/arm64/kernel/acpi_parking_protocol.c
@@ -103,7 +103,7 @@ static int acpi_parking_protocol_cpu_boot(unsigned int cpu)
 		       &mailbox->entry_point);
 	writel_relaxed(cpu_entry->gic_cpu_id, &mailbox->cpu_id);
 
-   arch_send_wakeup_ipi_mask(cpumask_of(cpu));
+   arch_send_wakeup_ipi(cpu);
 
return 0;
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 960b98b43506..a5848f1ef817 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -72,7 +72,6 @@ enum ipi_msg_type {
IPI_CPU_CRASH_STOP,
IPI_TIMER,
IPI_IRQ_WORK,
-   IPI_WAKEUP,
NR_IPI
 };
 
@@ -764,7 +763,6 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
[IPI_CPU_CRASH_STOP]= "CPU stop (for crash dump) interrupts",
[IPI_TIMER] = "Timer broadcast interrupts",
[IPI_IRQ_WORK]  = "IRQ work interrupts",
-   [IPI_WAKEUP]= "CPU wake-up interrupts",
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr);
@@ -797,13 +795,6 @@ void arch_send_call_function_single_ipi(int cpu)
smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC);
 }
 
-#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
-void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
-{
-   smp_cross_call(mask, IPI_WAKEUP);
-}
-#endif
-
 #ifdef CONFIG_IRQ_WORK
 void arch_irq_work_raise(void)
 {
@@ -897,14 +888,6 @@ static void do_handle_IPI(int ipinr)
break;
 #endif
 
-#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
-   case IPI_WAKEUP:
-   WARN_ONCE(!acpi_parking_protocol_valid(cpu),
- "CPU%u: Wake-up IPI outside the ACPI parking 
protocol\n",
- cpu);
-   break;
-#endif
-
default:
pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr);
break;
@@ -979,6 +962,17 @@ void arch_smp_send_reschedule(int cpu)
smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
 }
 
+#ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
+void arch_send_wakeup_ipi(unsigned int cpu)
+{
+   /*
+* We use a scheduler IPI to wake the CPU as this avoids the need for a
+* dedicated IPI and we can safely handle spurious scheduler IPIs.
+*/
+   arch_smp_send_reschedule(cpu);
+}
+#endif
+
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 void tick_broadcast(const struct cpumask *mask)
 {
-- 
2.42.0.283.g2d96d420d3-goog





[Kgdb-bugreport] [PATCH v12 1/7] irqchip/gic-v3: Enable support for SGIs to act as NMIs

2023-08-30 Thread Douglas Anderson
As of commit 6abbd6988971 ("irqchip/gic, gic-v3: Make SGIs use
handle_percpu_devid_irq()") SGIs are treated the same as PPIs/EPPIs
and use handle_percpu_devid_irq() by default. Unfortunately,
handle_percpu_devid_irq() isn't NMI safe, and so to run in an NMI
context those should use handle_percpu_devid_fasteoi_nmi().

In order to accomplish this, we just have to make room for SGIs in the
array of refcounts that keeps track of which interrupts are set as
NMI. We also rename the array and create a new indexing scheme that
accounts for SGIs.
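
A worked example of the new indexing scheme (derived from
__gic_get_rdist_idx() in the hunks below; the EPPI base value of 1056 is
the usual GICv3 EPPI_BASE_INTID and is assumed here):

  /*
   *   SGI  hwirq    7 -> rdist_nmi_refs[7]
   *   PPI  hwirq   27 -> rdist_nmi_refs[27]
   *   EPPI hwirq 1056 -> rdist_nmi_refs[1056 - EPPI_BASE_INTID + 32] = [32]
   *
   * i.e. SGIs and PPIs index the array directly, while EPPIs are packed
   * in right after the 32 SGI/PPI slots.
   */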

Also, enable NMI support prior to gic_smp_init(), as the allocation of
SGIs as IRQs/NMIs happens as part of that routine.

Co-developed-by: Sumit Garg 
Signed-off-by: Sumit Garg 
Signed-off-by: Douglas Anderson 
---
I'll note that this change is a little more black magic to me than
others in this series. I don't have a massive amounts of familiarity
with all the moving parts of gic-v3, so I mostly just followed Mark
Rutland's advice [1]. Please pay extra attention to make sure I didn't
do anything too terrible.

Mark's advice wasn't a full patch and I ended up doing a bit of work
to translate it to reality, so I did not add him as "Co-developed-by"
here. Mark: if you would like this tag then please provide it and your
Signed-off-by. I certainly won't object.

[1] https://lore.kernel.org/r/znc-yrqopo0pa...@fvff77s0q05n.cambridge.arm.com

Changes in v12:
- Added a comment about why we account for 16 SGIs when Linux uses 8.

Changes in v10:
- Rewrite as needed for 5.11+ as per Mark Rutland and Sumit.

 drivers/irqchip/irq-gic-v3.c | 59 +---
 1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index eedfa8e9f077..8d20122ba0a8 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -78,6 +78,13 @@ static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key);
 #define GIC_LINE_NR	min(GICD_TYPER_SPIS(gic_data.rdists.gicd_typer), 1020U)
 #define GIC_ESPI_NR	GICD_TYPER_ESPIS(gic_data.rdists.gicd_typer)
 
+/*
+ * There are 16 SGIs, though we only actually use 8 in Linux. The other 8 SGIs
+ * are potentially stolen by the secure side. Some code, especially code dealing
+ * with hwirq IDs, is simplified by accounting for all 16.
+ */
+#define SGI_NR 16
+
 /*
  * The behaviours of RPR and PMR registers differ depending on the value of
  * SCR_EL3.FIQ, and the behaviour of non-secure priority registers of the
@@ -125,8 +132,8 @@ EXPORT_SYMBOL(gic_nonsecure_priorities);
__priority; \
})
 
-/* ppi_nmi_refs[n] == number of cpus having ppi[n + 16] set as NMI */
-static refcount_t *ppi_nmi_refs;
+/* rdist_nmi_refs[n] == number of cpus having the rdist interrupt n set as NMI */
+static refcount_t *rdist_nmi_refs;
 
 static struct gic_kvm_info gic_v3_kvm_info __initdata;
 static DEFINE_PER_CPU(bool, has_rss);
@@ -519,9 +526,22 @@ static u32 __gic_get_ppi_index(irq_hw_number_t hwirq)
}
 }
 
-static u32 gic_get_ppi_index(struct irq_data *d)
+static u32 __gic_get_rdist_idx(irq_hw_number_t hwirq)
+{
+   switch (__get_intid_range(hwirq)) {
+   case SGI_RANGE:
+   case PPI_RANGE:
+   return hwirq;
+   case EPPI_RANGE:
+   return hwirq - EPPI_BASE_INTID + 32;
+   default:
+   unreachable();
+   }
+}
+
+static u32 gic_get_rdist_idx(struct irq_data *d)
 {
-   return __gic_get_ppi_index(d->hwirq);
+   return __gic_get_rdist_idx(d->hwirq);
 }
 
 static int gic_irq_nmi_setup(struct irq_data *d)
@@ -545,11 +565,14 @@ static int gic_irq_nmi_setup(struct irq_data *d)
 
/* desc lock should already be held */
if (gic_irq_in_rdist(d)) {
-   u32 idx = gic_get_ppi_index(d);
+   u32 idx = gic_get_rdist_idx(d);
 
-   /* Setting up PPI as NMI, only switch handler for first NMI */
-   if (!refcount_inc_not_zero(&ppi_nmi_refs[idx])) {
-   refcount_set(&ppi_nmi_refs[idx], 1);
+   /*
+* Setting up a percpu interrupt as NMI, only switch handler
+* for first NMI
+*/
+   if (!refcount_inc_not_zero(&rdist_nmi_refs[idx])) {
+   refcount_set(&rdist_nmi_refs[idx], 1);
desc->handle_irq = handle_percpu_devid_fasteoi_nmi;
}
} else {
@@ -582,10 +605,10 @@ static void gic_irq_nmi_teardown(struct irq_data *d)
 
/* desc lock should already be held */
if (gic_irq_in_rdist(d)) {
-   u32 idx = gic_get_ppi_index(d);
+   u32 idx = gic_get_rdist_idx(d);
 
/* Tearing down NMI, only switch handler for last NMI */
-   if (refcount_dec_and_test(&ppi_nmi_refs[idx]))
+   if (refcount_dec_and_test(&rdist_nmi_refs[idx]))
desc->handle_irq = handle_percpu_devid_irq;
} else {
  

Re: [Kgdb-bugreport] [PATCH] kgdb: Flush console before entering kgdb on panic

2023-08-30 Thread Daniel Thompson
On Fri, Aug 25, 2023 at 07:18:44AM -0700, Doug Anderson wrote:
> Hi,
>
> On Fri, Aug 25, 2023 at 3:09 AM Daniel Thompson
>  wrote:
> >
> > On Tue, Aug 22, 2023 at 01:19:46PM -0700, Douglas Anderson wrote:
> > > When entering kdb/kgdb on a kernel panic, it was observed that the
> > > console isn't flushed before the `kdb` prompt came up. Specifically,
> > > when using the buddy lockup detector on arm64 and running:
> > >   echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
> > >
> > > I could see:
> > >   [   26.161099] lkdtm: Performing direct entry HARDLOCKUP
> > >   [   32.499881] watchdog: Watchdog detected hard LOCKUP on cpu 6
> > >   [   32.552865] Sending NMI from CPU 5 to CPUs 6:
> > >   [   32.557359] NMI backtrace for cpu 6
> > >   ... [backtrace for cpu 6] ...
> > >   [   32.558353] NMI backtrace for cpu 5
> > >   ... [backtrace for cpu 5] ...
> > >   [   32.867471] Sending NMI from CPU 5 to CPUs 0-4,7:
> > >   [   32.872321] NMI backtrace forP cpuANC: Hard LOCKUP
> > >
> > >   Entering kdb (current=..., pid 0) on processor 5 due to Keyboard Entry
> > >   [5]kdb>
> > >
> > > As you can see, backtraces for the other CPUs start printing and get
> > > interleaved with the kdb PANIC print.
> > >
> > > Let's replicate the commands to flush the console in the kdb panic
> > > entry point to avoid this.
> > >
> > > Signed-off-by: Douglas Anderson 
> > > ---
> > >
> > >  kernel/debug/debug_core.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > > index d5e9ccde3ab8..3a904d8697c8 100644
> > > --- a/kernel/debug/debug_core.c
> > > +++ b/kernel/debug/debug_core.c
> > > @@ -1006,6 +1006,9 @@ void kgdb_panic(const char *msg)
> > >   if (panic_timeout)
> > >   return;
> > >
> > > + debug_locks_off();
> > > + console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> > > +
> > >   if (dbg_kdb_mode)
> > >   kdb_printf("PANIC: %s\n", msg);
> >
> > I'm somewhat inclined to say *this* (calling kdb_printf() when not
> > actually in the debugger) is the cause of the problem. kdb_printf()
> > does some pretty horrid things to the console and isn't intended to
> > run while the system is active.
> >
> > I'd therefore be more tempted to defer the print to the b.p. trap
> > handler itself and make this part of kgdb_panic() look more like:
> >
> > kgdb_panic_msg = msg;
> > kgdb_breakpoint();
> > kgdb_panic_msg = NULL;
>
> Unfortunately I think that only solves half the problem. As a quick
> test, I tried simply commenting out the "kdb_printf" line in
> kgdb_panic(). While that avoids the interleaved panic message and
> backtrace, it does nothing to actually get the backtraces printed out
> before you end up in kdb. As an example, this is what happened when I
> used `echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT` and
> had the "kdb_printf" in kgdb_panic() commented out:
>
> [   72.658424] lkdtm: Performing direct entry HARDLOCKUP
> [   82.181857] watchdog: Watchdog detected hard LOCKUP on cpu 6
> ...
> [   82.234801] Sending NMI from CPU 5 to CPUs 6:
> [   82.239296] NMI backtrace for cpu 6
> ... [ stack trace for CPU 6 ] ...
> [   82.240294] NMI backtrace for cpu 5
> ... [ stack trace for CPU 5 ] ...
> [   82.576443] Sending NMI from CPU 5 to CPUs 0-4,7:
> [   82.581291] NMI backtrace
> Entering kdb (current=0xff80da5a1080, pid 6978) on processor 5 due
> to Keyboard Entry
> [5]kdb>
>
> As you can see, I don't see the traces for CPUs 0-4 and 7. Those do
> show up if I use the "dmesg" command but it's a bit of a hassle to run
> "dmesg" to look for any extra debug messages every time I drop in kdb.
>
> I guess perhaps that part isn't obvious from the commit message?

I figured it was a risk.

In fact it's an area where two of my instincts come into conflict:
honouring console messages, and getting into the debugger as soon as
possible once the decision to invoke it has been made.

In other words, does it matter that the console buffers are not flushed
before entering kgdb? Having thought about it for a little while (and
knowing the console code tends to be written to be decently robust),
I've come to the view that flushing is best.


> Should I send a new version with an updated commit message indicating
> that it's not just the jumbled text that's a problem but also the lack
> of stack traces?

No real need.

I don't really like seeing kdb_printf() being called from here but,
having reviewed a bit of console code, I think we might be able
to use the new infrastructure to make kdb_printf() slightly less
hateful ;-).


Daniel.

