Re: [PATCH v2 08/13] tracing: Improve panic/die notifiers
On Tue, 16 Aug 2022 10:57:20 -0400 Alan Stern wrote: > > static int trace_die_panic_handler(struct notifier_block *self, > > unsigned long ev, void *unused) > > { > > if (!ftrace_dump_on_oops) > > return NOTIFY_DONE; > > > > /* The die notifier requires DIE_OOPS to trigger */ > > if (self == _die_notifier && ev != DIE_OOPS) > > return NOTIFY_DONE; > > > > ftrace_dump(ftrace_dump_on_oops); > > > > return NOTIFY_DONE; > > } > > Or better yet: > > if (ftrace_dump_on_oops) { > > /* The die notifier requires DIE_OOPS to trigger */ > if (self != _die_notifier || ev == DIE_OOPS) > ftrace_dump(ftrace_dump_on_oops); > } > return NOTIFY_DONE; > That may be more consolidated but less easy to read and follow. This is far from a fast path. As I maintain this bike-shed, I prefer the one I suggested ;-) -- Steve ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 08/13] tracing: Improve panic/die notifiers
On Tue, 19 Jul 2022 16:53:21 -0300 "Guilherme G. Piccoli" wrote: Sorry for the late review, but this fell to the bottom of my queue :-/ > +/* > + * The idea is to execute the following die/panic callback early, in order > + * to avoid showing irrelevant information in the trace (like other panic > + * notifier functions); we are the 2nd to run, after hung_task/rcu_stall > + * warnings get disabled (to prevent potential log flooding). > + */ > +static int trace_die_panic_handler(struct notifier_block *self, > + unsigned long ev, void *unused) > +{ > + if (!ftrace_dump_on_oops) > + goto out; > + > + if (self == _die_notifier && ev != DIE_OOPS) > + goto out; I really hate gotos that are not for clean ups. > + > + ftrace_dump(ftrace_dump_on_oops); > + > +out: > + return NOTIFY_DONE; > +} > + Just do: static int trace_die_panic_handler(struct notifier_block *self, unsigned long ev, void *unused) { if (!ftrace_dump_on_oops) return NOTIFY_DONE; /* The die notifier requires DIE_OOPS to trigger */ if (self == _die_notifier && ev != DIE_OOPS) return NOTIFY_DONE; ftrace_dump(ftrace_dump_on_oops); return NOTIFY_DONE; } Thanks, Other than that, Acked-by: Steven Rostedt (Google) -- Steve ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 08/13] tracing: Improve panic/die notifiers
On Tue, Aug 16, 2022 at 10:14:45AM -0400, Steven Rostedt wrote: > On Tue, 19 Jul 2022 16:53:21 -0300 > "Guilherme G. Piccoli" wrote: > > > Sorry for the late review, but this fell to the bottom of my queue :-/ > > > +/* > > + * The idea is to execute the following die/panic callback early, in order > > + * to avoid showing irrelevant information in the trace (like other panic > > + * notifier functions); we are the 2nd to run, after hung_task/rcu_stall > > + * warnings get disabled (to prevent potential log flooding). > > + */ > > +static int trace_die_panic_handler(struct notifier_block *self, > > + unsigned long ev, void *unused) > > +{ > > + if (!ftrace_dump_on_oops) > > + goto out; > > + > > + if (self == _die_notifier && ev != DIE_OOPS) > > + goto out; > > I really hate gotos that are not for clean ups. > > > + > > + ftrace_dump(ftrace_dump_on_oops); > > + > > +out: > > + return NOTIFY_DONE; > > +} > > + > > Just do: > > static int trace_die_panic_handler(struct notifier_block *self, > unsigned long ev, void *unused) > { > if (!ftrace_dump_on_oops) > return NOTIFY_DONE; > > /* The die notifier requires DIE_OOPS to trigger */ > if (self == _die_notifier && ev != DIE_OOPS) > return NOTIFY_DONE; > > ftrace_dump(ftrace_dump_on_oops); > > return NOTIFY_DONE; > } Or better yet: if (ftrace_dump_on_oops) { /* The die notifier requires DIE_OOPS to trigger */ if (self != _die_notifier || ev == DIE_OOPS) ftrace_dump(ftrace_dump_on_oops); } return NOTIFY_DONE; Alan Stern > Thanks, > > Other than that, Acked-by: Steven Rostedt (Google) > > -- Steve ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v10 7/8] crash: memory and cpu hotplug sysfs attributes
On 8/8/22 05:41, Baoquan He wrote: On 07/21/22 at 02:17pm, Eric DeVolder wrote: This introduces the crash_hotplug attribute for memory and CPUs for use by userspace. This change directly facilitates the udev rule for managing userspace re-loading of the crash kernel upon hot un/plug changes. For memory, this changeset introduces the crash_hotplug attribute to the /sys/devices/system/memory directory. For example: # udevadm info --attribute-walk /sys/devices/system/memory/memory81 looking at device '/devices/system/memory/memory81': KERNEL=="memory81" SUBSYSTEM=="memory" DRIVER=="" ATTR{online}=="1" ATTR{phys_device}=="0" ATTR{phys_index}=="0051" ATTR{removable}=="1" ATTR{state}=="online" ATTR{valid_zones}=="Movable" looking at parent device '/devices/system/memory': KERNELS=="memory" SUBSYSTEMS=="" DRIVERS=="" ATTRS{auto_online_blocks}=="offline" ATTRS{block_size_bytes}=="800" ATTRS{crash_hotplug}=="1" For CPUs, this changeset introduces the crash_hotplug attribute to the /sys/devices/system/cpu directory. For example: # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0 looking at device '/devices/system/cpu/cpu0': KERNEL=="cpu0" SUBSYSTEM=="cpu" DRIVER=="processor" ATTR{crash_notes}=="277c38600" ATTR{crash_notes_size}=="368" ATTR{online}=="1" looking at parent device '/devices/system/cpu': KERNELS=="cpu" SUBSYSTEMS=="" DRIVERS=="" ATTRS{crash_hotplug}=="1" ATTRS{isolated}=="" ATTRS{kernel_max}=="8191" ATTRS{nohz_full}==" (null)" ATTRS{offline}=="4-7" ATTRS{online}=="0-3" ATTRS{possible}=="0-7" ATTRS{present}=="0-3" With these sysfs attributes in place, it is possible to efficiently instruct the udev rule to skip crash kernel reloading. For example, the following is the proposed udev rule change for RHEL system 98-kexec.rules (as the first lines of the rule file): # The kernel handles updates to crash elfcorehdr for cpu and memory changes SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end" When examined in the context of 98-kexec.rules, the above change tests if crash_hotplug is set, and if so, it skips the userspace initiated unload-then-reload of the crash kernel. Cpu and memory checks are separated in accordance with CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options. If an architecture supports, for example, memory hotplug but not CPU hotplug, then the /sys/devices/system/memory/crash_hotplug attribute file is present, but the /sys/devices/system/cpu/crash_hotplug attribute file will NOT be present. Thus the udev rule will skip userspace processing of memory hot un/plug events, but the udev rule will fail for CPU events, thus allowing userspace to process cpu hot un/plug events (ie the unload-then-reload of the kdump capture kernel). Signed-off-by: Eric DeVolder LGTM, Acked-by: Baoquan He Awesome, thank you! eric --- .../admin-guide/mm/memory-hotplug.rst | 8 Documentation/core-api/cpu_hotplug.rst | 18 ++ drivers/base/cpu.c | 14 ++ drivers/base/memory.c | 13 + include/linux/crash_core.h | 6 ++ 5 files changed, 59 insertions(+) diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst index 0f56ecd8ac05..494d7a63c543 100644 --- a/Documentation/admin-guide/mm/memory-hotplug.rst +++ b/Documentation/admin-guide/mm/memory-hotplug.rst @@ -293,6 +293,14 @@ The following files are currently defined: Availability depends on the CONFIG_ARCH_MEMORY_PROBE kernel configuration option. ``uevent`` read-write: generic udev file for device subsystems. +``crash_hotplug`` read-only: when changes to the system memory map + occur due to hot un/plug of memory, this file contains + '1' if the kernel updates the kdump capture kernel memory + map itself (via elfcorehdr), or '0' if userspace must update + the kdump capture kernel memory map. + + Availability depends on the CONFIG_MEMORY_HOTPLUG kernel + configuration option. == = .. note:: diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst index c6f4ba2fb32d..13e33d098645 100644 --- a/Documentation/core-api/cpu_hotplug.rst +++ b/Documentation/core-api/cpu_hotplug.rst @@ -750,6 +750,24 @@ will receive all events. A script like:: can process the event further. +When changes to the CPUs in the system occur, the sysfs file
Re: [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support
On 8/12/22 19:34, Baoquan He wrote: On 07/21/22 at 02:17pm, Eric DeVolder wrote: ...snip diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index e58798f636d4..bb59596c8bea 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2065,6 +2065,17 @@ config CRASH_DUMP (CONFIG_RELOCATABLE=y). For more details see Documentation/admin-guide/kdump/kdump.rst +config CRASH_MAX_MEMORY_RANGES + depends on CRASH_DUMP && KEXEC_FILE && (HOTPLUG_CPU || MEMORY_HOTPLUG) + int + default 32768 Do we need to enforce the value with page align and minimal size? I Are you asking about the value CRASH_MAX_MEMORY_RANGES? This value represents the maximum number of memory ranges, and there Elf64_Phdrs, that we need to allow for elfcorehdr memory. So I'm not sure what the concern for alignment is. I suppose we could also institute a minimum size for this value, say 1024. checked crash_load_segments() in arch/x86/kernel/crash.c, it does the page size aligning in kexec_add_buffer(). And in load_crashdump_segments() of kexec-tools/kexec/arch/i386/crashdump-x86.c, it creates elfcorehdr at below code, the align is 1024, and in generic add_buffer() implementation, it enforces the memsz page aligned, and changes the passed align as page alignment. elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base, max_addr, -1); Maybe we should at least mention this in the help text to notice people. Unfortunately I do not yet understand the concern being raised. + help + For the kexec_file_load path, specify the maximum number of + memory regions, eg. as represented by the 'System RAM' entries + in /proc/iomem, that the elfcorehdr buffer/segment can accommodate. + This value is combined with NR_CPUS and multiplied by Elf64_Phdr + size to determine the final buffer size. + config KEXEC_JUMP bool "kexec jump" depends on KEXEC && HIBERNATION diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h index 8b6bd63530dc..96051d8e4b45 100644 --- a/arch/x86/include/asm/crash.h +++ b/arch/x86/include/asm/crash.h @@ -9,4 +9,24 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params); void crash_smp_send_stop(void); +void *arch_map_crash_pages(unsigned long paddr, unsigned long size); +#define arch_map_crash_pages arch_map_crash_pages + +void arch_unmap_crash_pages(void **ptr); +#define arch_unmap_crash_pages arch_unmap_crash_pages + +void arch_crash_handle_hotplug_event(struct kimage *image, + unsigned int hp_action, unsigned int cpu); +#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event + +#ifdef CONFIG_HOTPLUG_CPU +static inline int crash_hotplug_cpu_support(void) { return 1; } +#define crash_hotplug_cpu_support crash_hotplug_cpu_support +#endif + +#ifdef CONFIG_MEMORY_HOTPLUG +static inline int crash_hotplug_memory_support(void) { return 1; } +#define crash_hotplug_memory_support crash_hotplug_memory_support +#endif + #endif /* _ASM_X86_CRASH_H */ diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index 9ceb93c176a6..55dda4fcde6e 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -397,7 +398,17 @@ int crash_load_segments(struct kimage *image) image->elf_headers = kbuf.buffer; image->elf_headers_sz = kbuf.bufsz; +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG) + /* Ensure elfcorehdr segment large enough for hotplug changes */ + kbuf.memsz = (CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES) * sizeof(Elf64_Phdr); Do we need to break the line to 80 chars? Sure, I will do so. + /* For marking as usable to crash kernel */ + image->elf_headers_sz = kbuf.memsz; Do we need this code comment? Well, it did take me a while to figure this particular item out in order for all this code to work right (else the crash kernel would fail at boot time). So I think it best to keep this comment. + /* Record the index of the elfcorehdr segment */ + image->elfcorehdr_index = image->nr_segments; And this place? Not necessarily needed, but I've found it useful. + image->elfcorehdr_index_valid = true; +#else kbuf.memsz = kbuf.bufsz; +#endif kbuf.buf_align = ELF_CORE_HEADER_ALIGN; kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; ret = kexec_add_buffer(); @@ -412,3 +423,107 @@ int crash_load_segments(struct kimage *image) return ret; } #endif /* CONFIG_KEXEC_FILE */ + +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG) +void *arch_map_crash_pages(unsigned long paddr, unsigned long size) +{ + /* +* NOTE: The addresses and sizes passed to this routine have +* already been fully aligned on
Re: [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h
On 8/12/22 19:24, Baoquan He wrote: On 08/12/22 at 04:23pm, Eric DeVolder wrote: On 8/12/22 04:46, Baoquan He wrote: On 08/08/22 at 10:18am, Eric DeVolder wrote: On 8/7/22 22:25, Baoquan He wrote: Hi Eric, On 07/21/22 at 02:17pm, Eric DeVolder wrote: The use of __weak is being eliminated within kexec sources. The technique uses macros mapped onto inline functions in order to replace __weak. This patchset was using __weak and so in order to replace __weak, this patch introduces arch/*/asm/crash.h, patterned after how kexec is moving away from __weak and to the macro definitions. Are you going to replace __weak in kexec of arll ARCHes? I don't see your point why all these empty header files are introduced. Wondering what's impacted if not adding these empty files? Hi Baoquan, In this patchset, to file include/linux/crash_core.h I added the line #include . I patterned this after how include/linux/kexec.h does #include . I am sorry, Eric, it looks not so good. I understand you want to pattern asm/kexe.h, but we need consider reality. Introducing a dozen of empty header file and not being able to tell when they will be filled doesn't make sense. Includig where needed is much simpler. I doubt if your way can pass other reviewers' line. Can you reconsider? If I include where needed, which is kernel/crash_core.c, then the other archs will fail build if that file doesn't exist. A couple of options, which do you think is better to pursue? - use asm/kexec.h instead of asm/crash.h; it appears all the architectures already have this file in place - go ahead and put the appropriate crash macros/inline functions into each arch asm/crash.h so that the files are not just empty, and leave the use of asm/crash.h I think we can do this in two steps. Firstly, make do with asm/kexec.h since all other ARCHes put crash related stuff into asm/kexec.h, except of x86. Baoquan, I've made the changes to utilizes asm/kexec.h. For x86, I did have to put what I previously had in asm/crash.h into asm/kexec.h. Secondly, clean up to put those crash marco/inline functions into asm/crash.h. Yes, in looking at kexec.h, there is alot of crash items in there that would make for a good cleanup pass... The 2nd step can be done in a independent patchset. What do you think? Yes! eric ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec