[Issue]
If a cpu that is holding psinfo->buf_lock receives an NMI from a
panicked cpu via smp_send_stop(), it is stopped without releasing the
lock. The panicked cpu then hangs in pstore_dump(), called by
kmsg_dump(KMSG_DUMP_PANIC), because pstore_dump() tries to take
psinfo->buf_lock again.

To avoid the deadlock, an easy solution would be to move kmsg_dump()
above smp_send_stop() in the panic path. However, it is not safe to kick
pstore while multiple cpus are still running in the panic case, because
they may touch corrupted data/variables and cause unnecessary failures.
In that case, we can't guarantee that the panicked cpu logs its messages
reliably, because those failures may interfere with it.

[Solution]
This patch skips taking psinfo->buf_lock when just one cpu is online,
because cpus stopped via smp_send_stop() go offline on some
architectures such as x86, powerpc and arm64. With a single cpu online,
pstore_dump() only disables local interrupts instead.

It may be a hack, but it solves the deadlock I am concerned about on
x86.

Signed-off-by: Seiji Aguchi <seiji.agu...@hds.com>
---
 fs/pstore/platform.c |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 947fbe0..ca4d2ab 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -107,7 +107,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 	unsigned long total = 0;
 	const char *why;
 	u64 id;
-	unsigned int part = 1;
+	unsigned int part = 1, cpu_num = num_online_cpus();
 	unsigned long flags = 0;
 	int is_locked = 0;
 	int ret;
@@ -118,8 +118,14 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 		is_locked = spin_trylock(&psinfo->buf_lock);
 		if (!is_locked)
 			pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
-	} else
+	} else if (cpu_num > 1) {
+		/*
+		 * Take a spin lock only when multiple cpus are online.
+		 */
 		spin_lock_irqsave(&psinfo->buf_lock, flags);
+	} else
+		local_irq_save(flags);
+
 	oopscount++;
 	while (total < kmsg_bytes) {
 		char *dst;
@@ -146,8 +152,10 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 	if (in_nmi()) {
 		if (is_locked)
 			spin_unlock(&psinfo->buf_lock);
-	} else
+	} else if (cpu_num > 1) {
 		spin_unlock_irqrestore(&psinfo->buf_lock, flags);
+	} else
+		local_irq_restore(flags);
 }
 
 static struct kmsg_dumper pstore_dumper = {
-- 
1.7.1
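
For review purposes, the branch order that pstore_dump() follows with
this patch can be restated as a small userspace sketch. The names
pick_lock_mode() and enum lock_mode are hypothetical and exist only for
illustration; they are not part of the patch or of the kernel, and the
sketch does not model the locking itself, only which path is chosen.

```c
#include <stdbool.h>

/*
 * Hypothetical illustration only: these names do not exist in the kernel.
 * Each value stands for one of the three paths in the patched pstore_dump().
 */
enum lock_mode {
	LOCK_TRY,	/* NMI context: spin_trylock(), never spin */
	LOCK_SPIN,	/* several cpus online: spin_lock_irqsave() */
	LOCK_SKIP,	/* one cpu online: local_irq_save() only */
};

/*
 * Mirror the if / else if / else chain from the diff:
 * the NMI check comes first, then the online-cpu count.
 */
static enum lock_mode pick_lock_mode(bool in_nmi, unsigned int online_cpus)
{
	if (in_nmi)
		return LOCK_TRY;
	if (online_cpus > 1)
		return LOCK_SPIN;
	return LOCK_SKIP;
}
```

Note that the NMI check takes precedence: even with a single cpu online,
a dump from NMI context still goes through the trylock path, as in the
diff.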