> On 9. Nov 2023, at 17:19, Corey Minyard <miny...@acm.org> wrote:
> 
>> 
>> I saw that a lot of work around printk() happened in and after 5.15, so I 
>> guess upgrading to 5.15 shouldn’t hurt in any case.
>> 
>> Not having a reproducer is the real bummer.
>> 
>> I was also wondering whether using other utilities like storing kernel crash 
>> dumps on swap would be a good avenue. The only issue here being that this is 
>> a KVM/Qemu host with lots of RAM and I think I don’t have enough disk space 
>> to capture full system memory dumps … 
> 
> You might be surprised, it is probably smaller than you think.

Consider me interested … 

> Actually, I'm confusing this with another issue dealing with real time
> latencies and printk.  Preempt tracing won't help your issue.
> 
> I assume you are using the "normal" NMI watchdog and it's not
> triggering.  It should be on by default.  You can look in
> /proc/sys/kernel/nmi_watchdog to see.

Hmm … 

# cat /proc/sys/kernel/nmi_watchdog
cat: /proc/sys/kernel/nmi_watchdog: No such file or directory

I’m not sure whether our kernel config is missing something … config.gz shows 
me:

root@kyle34 [prod] .../sys/kernel # zgrep NMI /proc/config.gz
CONFIG_PRINTK_NMI=y
CONFIG_HAVE_ACPI_APEI_NMI=y
CONFIG_OPROFILE_NMI_TIMER=y
CONFIG_HAVE_NMI=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_HPWDT_NMI_DECODING=y
CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
# CONFIG_DEBUG_NMI_SELFTEST is not set

root@kyle34 [prod] .../sys/kernel # zgrep WATCH /proc/config.gz
# CONFIG_WATCH_QUEUE is not set
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ACPI_WATCHDOG=y
CONFIG_IPMI_WATCHDOG=m
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=m
# CONFIG_WATCHDOG_NOWAYOUT is not set
CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y
CONFIG_WATCHDOG_OPEN_TIMEOUT=0
# CONFIG_WATCHDOG_SYSFS is not set
# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
CONFIG_SOFT_WATCHDOG=m
CONFIG_DA9063_WATCHDOG=m
CONFIG_DA9062_WATCHDOG=m
CONFIG_MENF21BMC_WATCHDOG=m
CONFIG_MENZ069_WATCHDOG=m
CONFIG_XILINX_WATCHDOG=m
CONFIG_ZIIRAVE_WATCHDOG=m
CONFIG_RAVE_SP_WATCHDOG=m
CONFIG_CADENCE_WATCHDOG=m
CONFIG_DW_WATCHDOG=m
CONFIG_MAX63XX_WATCHDOG=m
CONFIG_RETU_WATCHDOG=m
CONFIG_SBC_FITPC2_WATCHDOG=m
CONFIG_HP_WATCHDOG=m
CONFIG_SBC_EPX_C3_WATCHDOG=m
CONFIG_PCIPCWATCHDOG=m
CONFIG_USBPCWATCHDOG=m
CONFIG_COMEDI_ADDI_WATCHDOG=m
# CONFIG_WQ_WATCHDOG is not set

# sysctl -a | grep nmi
kernel.panic_on_io_nmi = 0
kernel.panic_on_unrecovered_nmi = 0
kernel.unknown_nmi_panic = 0

This is an area I haven’t touched before. I’m a bit confused that the file 
doesn’t exist. The internet talks about corresponding sysctl settings which 
also do not exist … 

I’m seeing a number of options that control whether other situations should 
result in panics, which we apparently don’t set and which aren’t set by default:

root@kyle34 [prod] .../sys/kernel # sysctl -a | grep panic
fs.xfs.panic_mask = 0
kernel.hung_task_panic = 0
kernel.panic = 1
kernel.panic_on_io_nmi = 0
kernel.panic_on_oops = 0
kernel.panic_on_rcu_stall = 0
kernel.panic_on_unrecovered_nmi = 0
kernel.panic_on_warn = 0
kernel.panic_print = 0
kernel.unknown_nmi_panic = 0
vm.panic_on_oom = 0

And so, looking around I find:

CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
# CONFIG_SOFTLOCKUP_DETECTOR is not set
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
# CONFIG_HARDLOCKUP_DETECTOR is not set
CONFIG_TEST_LOCKUP=m

Reading the kernel docs, this looks like an oversight on our side: we might be 
experiencing lockups that do not result in panics (and which therefore won’t 
show up in the SEL).
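
A minimal sketch of how that config check could be scripted, so it can be 
repeated on other hosts. The function name lockup_detector_status is made up 
for illustration; the two option names are the ones from the config excerpt 
above. It only reads config text from stdin and changes nothing:

```shell
# Report the state of the lockup-detector options in a kernel config.
# Reads the config text on stdin and prints one line per option:
#   "<option> enabled", "<option> disabled" or "<option> absent".
# (lockup_detector_status is a hypothetical helper name.)
lockup_detector_status() {
    config=$(cat)
    for opt in CONFIG_SOFTLOCKUP_DETECTOR CONFIG_HARDLOCKUP_DETECTOR; do
        # Anchored matches, so CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF does not count.
        if printf '%s\n' "$config" | grep "^${opt}=[ym]\$" >/dev/null; then
            echo "$opt enabled"
        elif printf '%s\n' "$config" | grep "^# ${opt} is not set\$" >/dev/null; then
            echo "$opt disabled"
        else
            echo "$opt absent"
        fi
    done
}
```

On a host like this one it would be fed with 
"zcat /proc/config.gz | lockup_detector_status". As far as I can tell from the 
kernel docs, sysctls such as kernel.nmi_watchdog, kernel.softlockup_panic and 
kernel.hardlockup_panic only appear once the corresponding detector is built in.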

I guess we’ve found some more homework we can do on our side to get better 
visibility.

AFAICT I can easily trigger an NMI from ipmi and then verify that this causes 
proper SEL entries … 
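
A rough sketch of how that verification could look, assuming ipmitool is 
installed and can talk to the local BMC. The helper name sel_new_entries and the 
/tmp file names are made up for illustration. The helper only compares two saved 
SEL listings; the ipmitool calls are shown as comments because pulsing an NMI on 
a production host should be a deliberate, manual step:

```shell
# Print SEL lines that appear in the "after" listing but not in "before".
# $1 and $2 are files holding the output of `ipmitool sel list`.
# (sel_new_entries is a hypothetical helper name.)
sel_new_entries() {
    before=$1
    after=$2
    # -F fixed strings, -x whole-line match, -v invert: keep only unseen lines.
    # grep exits 1 when nothing is printed; treat that as success, not an error.
    grep -Fxv -f "$before" "$after" || [ $? -eq 1 ]
}

# Intended usage (not executed here):
#   ipmitool sel list > /tmp/sel.before
#   ipmitool chassis power diag      # pulses a diagnostic interrupt (NMI)
#   ipmitool sel list > /tmp/sel.after
#   sel_new_entries /tmp/sel.before /tmp/sel.after
```

Whether the BMC logs anything for a manually triggered NMI, and what the kernel 
does with it while kernel.unknown_nmi_panic is 0, is exactly what this test is 
meant to find out, so no particular output should be expected up front.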

(In any case, the NMI lockup theory isn’t 100% convincing, as the problem has 
completely stopped happening since we attached the SOLs …)

> I was working with a customer of our company's on something similar, a
> watchdog reset with no discernible reason.  They couldn't use the
> standard NMI watchdog, so we switched to using an NMI watchdog from the
> BMC.  Set the preaction to nmi and the preop to panic.
> 
> With that, you can take a kernel coredump.  You generally only take a
> coredump of kernel memory (without buffers), so it's not generally a
> huge amount of disk space, and it's compressed.
> 
> Of course, then you have to analyze a coredump, which has its own
> difficulties :-(.

Thanks, that does sound more feasible than dumping all application memory 
state. And yes, dumps are a hassle, so I’ll invest in our homework to get the 
NMI stuff working first.
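
For reference, a sketch of what the BMC watchdog setup described above might 
look like with the ipmi_watchdog module (which our config builds as a module). 
The parameter names and values (action, preaction=pre_nmi, preop=preop_panic, 
timeout, pretimeout) are taken from my reading of the kernel's IPMI driver 
documentation; the helper name and the 60/30 second values are assumptions for 
illustration only:

```shell
# Print a modprobe.d "options" line for ipmi_watchdog that asks the BMC for an
# NMI before the reset and lets the driver panic when that NMI arrives.
# $1 = timeout in seconds, $2 = pretimeout in seconds (seconds before the
# reset at which the pretimeout fires, so it must be smaller than the timeout).
# (ipmi_watchdog_options is a hypothetical helper name.)
ipmi_watchdog_options() {
    timeout=$1
    pretimeout=$2
    if [ "$pretimeout" -ge "$timeout" ]; then
        echo "pretimeout must be smaller than timeout" >&2
        return 1
    fi
    echo "options ipmi_watchdog action=reset preaction=pre_nmi preop=preop_panic timeout=$timeout pretimeout=$pretimeout"
}
```

Calling "ipmi_watchdog_options 60 30" prints a line that could go into a file 
under /etc/modprobe.d/. If I read the IPMI docs correctly, the NMI preaction 
should not be combined with the regular NMI watchdog, because the kernel cannot 
tell the two NMI sources apart.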

Hugs,
Christian

-- 
Christian Theune · c...@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick



_______________________________________________
Openipmi-developer mailing list
Openipmi-developer@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openipmi-developer
