Re: [PATCH 1/3] memblock: define functions to set the usable memory range
On Fri, Jan 14, 2022 at 12:10:46AM +, Frank van der Linden wrote: > Thanks for discussing this further! I tried the above change, although > I wrapped it in an if (cap_mem_size != 0), so that a normal kernel > doesn't get its entire memory marked as reserved. Just to clarify: I had to do that because the code is slightly different on 5.15, where I tried it. It wasn't a bug in the proposed patch. Anyway, will try 5.17-rc tomorrow, I'll let you know, thanks again. - Frank ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 1/3] memblock: define functions to set the usable memory range
On Thu, Jan 13, 2022 at 07:33:11PM +0200, Mike Rapoport wrote: > On Tue, Jan 11, 2022 at 08:44:41PM +, Frank van der Linden wrote: > > On Tue, Jan 11, 2022 at 12:31:58PM +0200, Mike Rapoport wrote: > > > > --- a/include/linux/memblock.h > > > > +++ b/include/linux/memblock.h > > > > @@ -481,6 +481,8 @@ phys_addr_t memblock_reserved_size(void); > > > > phys_addr_t memblock_start_of_DRAM(void); > > > > phys_addr_t memblock_end_of_DRAM(void); > > > > void memblock_enforce_memory_limit(phys_addr_t memory_limit); > > > > +void memblock_set_usable_range(phys_addr_t base, phys_addr_t size); > > > > +void memblock_enforce_usable_range(void); > > > > void memblock_cap_memory_range(phys_addr_t base, phys_addr_t size); > > > > void memblock_mem_limit_remove_map(phys_addr_t limit); > > > > > > We already have 3 very similar interfaces that deal with memory capping. > > > Now you suggest to add fourth that will "generically" solve a single use > > > case of DT, EFI and kdump interaction on arm64. > > > > > > Looks like a workaround for a fundamental issue of incompatibility between > > > DT and EFI wrt memory registration. > > > > Yep, I figured this would be the main argument against this - arm64 > > already added several other more-or-less special cased interfaces over > > time. > > > > I'm more than happy to solve this in a different way. > > > > What would you suggest: > > > > 1) Try to merge the similar interfaces in to one. > > 2) Just deal with it at a lower (arm64) level? > > 3) Some other way? > > We've discussed this with Ard on IRC, and our conclusion was that on arm64 > kdump kernel should have memblock.memory exactly the same as the normal > kernel. Then, the memory outside usable-memory-range should be reserved so > that kdump kernel won't step over it. > > With that, simple (untested) patch below could be what we need: > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c > index bdca35284ceb..371418dffaf1 100644 > --- a/drivers/of/fdt.c > +++ b/drivers/of/fdt.c > @@ -1275,7 +1275,8 @@ void __init early_init_dt_scan_nodes(void) > of_scan_flat_dt(early_init_dt_scan_memory, NULL); > > /* Handle linux,usable-memory-range property */ > - memblock_cap_memory_range(cap_mem_addr, cap_mem_size); > + memblock_reserve(0, cap_mem_addr); > + memblock_reserve(cap_mem_addr + cap_mem_size, PHYS_ADDR_MAX); > } > > bool __init early_init_dt_scan(void *params) Thanks for discussing this further! I tried the above change, although I wrapped it in an if (cap_mem_size != 0), so that a normal kernel doesn't get its entire memory marked as reserved. The crash kernel hung without producing output - I haven't chased that down. This particular kernel was a bit older (5.15), so I'll try again with 5.17-rc to make sure. - Frank ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 1/3] memblock: define functions to set the usable memory range
On Tue, Jan 11, 2022 at 08:44:41PM +, Frank van der Linden wrote: > On Tue, Jan 11, 2022 at 12:31:58PM +0200, Mike Rapoport wrote: > > > --- a/include/linux/memblock.h > > > +++ b/include/linux/memblock.h > > > @@ -481,6 +481,8 @@ phys_addr_t memblock_reserved_size(void); > > > phys_addr_t memblock_start_of_DRAM(void); > > > phys_addr_t memblock_end_of_DRAM(void); > > > void memblock_enforce_memory_limit(phys_addr_t memory_limit); > > > +void memblock_set_usable_range(phys_addr_t base, phys_addr_t size); > > > +void memblock_enforce_usable_range(void); > > > void memblock_cap_memory_range(phys_addr_t base, phys_addr_t size); > > > void memblock_mem_limit_remove_map(phys_addr_t limit); > > > > We already have 3 very similar interfaces that deal with memory capping. > > Now you suggest to add fourth that will "generically" solve a single use > > case of DT, EFI and kdump interaction on arm64. > > > > Looks like a workaround for a fundamental issue of incompatibility between > > DT and EFI wrt memory registration. > > Yep, I figured this would be the main argument against this - arm64 > already added several other more-or-less special cased interfaces over > time. > > I'm more than happy to solve this in a different way. > > What would you suggest: > > 1) Try to merge the similar interfaces in to one. > 2) Just deal with it at a lower (arm64) level? > 3) Some other way? We've discussed this with Ard on IRC, and our conclusion was that on arm64 kdump kernel should have memblock.memory exactly the same as the normal kernel. Then, the memory outside usable-memory-range should be reserved so that kdump kernel won't step over it. With that, simple (untested) patch below could be what we need: diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index bdca35284ceb..371418dffaf1 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -1275,7 +1275,8 @@ void __init early_init_dt_scan_nodes(void) of_scan_flat_dt(early_init_dt_scan_memory, NULL); /* Handle linux,usable-memory-range property */ - memblock_cap_memory_range(cap_mem_addr, cap_mem_size); + memblock_reserve(0, cap_mem_addr); + memblock_reserve(cap_mem_addr + cap_mem_size, PHYS_ADDR_MAX); } bool __init early_init_dt_scan(void *params) > Thanks, > > - Frank > -- Sincerely yours, Mike. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH V2] panic: Move panic_print before kmsg dumpers
On 13/01/2022 11:22, Petr Mladek wrote: > [...] > OK, do we have any specific reason why panic_print_sys_info() > should get called right before kmsg_dump() when this code patch > is used? > > Alternative solution would be to remove the check of > kexec_crash_loaded() and always call panic_print_sys_info(false) > at the beginning (after kgdb_panic(buf)). > > The advantage is that panic_print_sys_info(false) will be always > called on the same location. It will give the same results > in all code paths so that it will be easier to interpret them. > And it will have the same problems so it should be easier > to debug and maintain. > > It is possible that it will not work for some users. Also it is > possible that it might cause some problems. But it is hard to > guess at least for me. > > I think that we might try it and see if anyone complains. > Honestly, I think that only few people use panic_printk_sys_info(). > And your use-case makes sense. > > Best Regards, > Petr Hi Petr, thanks for your idea - it's simple and effective, would resolve the problems in a straightforward way. But there are some cons, let me detail more. Currently (in linux-next), if users set panic_print but not kdump, the panic_print_sys_info() is called *after* the panic notifiers. If user has kdump configured and still sets panic_print (which is our use case), then panic_print_sys_info() is called _before_ the panic notifiers. In other words, the behavior for non-kdump users is the same as before, since the merge of panic_print functionality. Your idea brings the printing earlier, always before the panic notifiers. This isn't terrible, but might lead to unfortunate and hard-to-debug problems; for example, some panic notifiers are rcu_panic() and hung_task_panic(), both are simple functions to disable RCU warnings and hung task detector in panic scenarios. If we go with your idea, these won't get called before panic_print_sys_info(), even if kdump is not set. So, since the panic_print triggers a lot of printing in the console, we could face a stall and trigger RCU messages, maybe intermixed with the panic_print information. Again, this is not the end of the world, we could live with that. But keeping two calls - and guarding against double print using kexec_crash_loaded() as per your idea - is completely non-invasive and doesn't change the current behavior. So, let me know if we should go with your idea or keep both calls but guarding against double printing to keep the current behavior untouched. Both ways work for me, I'm slightly inclined to the latter, but I can rework the patch if you prefer and use your idea! Cheers, Guilherme ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH V2] panic: Move panic_print before kmsg dumpers
On Thu 2022-01-13 09:34:01, Guilherme G. Piccoli wrote: > On 13/01/2022 08:50, Petr Mladek wrote: > >> @@ -249,7 +252,7 @@ void panic(const char *fmt, ...) > >> * show some extra information on kernel log if it was set... > >> */ > >>if (kexec_crash_loaded()) > >> - panic_print_sys_info(); > >> + panic_print_sys_info(false); > > > > panic_print_sys_info(false) will be called twice when both > > kexec_crash_loaded() and _crash_kexec_post_notifiers are true. > > > > Do we really need to call panic_print_sys_info() here? All information > > provided by panic_print_sys_info(false) can be found also in > > the crash dump. > > > >>/* > >> * If we have crashed and we have a crash kernel loaded let it handle > >> @@ -283,6 +286,8 @@ void panic(const char *fmt, ...) > >> */ > >>atomic_notifier_call_chain(_notifier_list, 0, buf); > >> > >> + panic_print_sys_info(false); > > > > This is where the info might be printed 2nd time. > > > >> + > >>kmsg_dump(KMSG_DUMP_PANIC); > Thanks for catching this issue - indeed, if > "_crash_kexec_post_notifiers" is true, with this patch we print stuff > twice. I will submit a V3 that guards against that, using a bool, makes > sense to you? It might be possible to check kexec_crash_loaded() on the two locations. But I think about even easier solution, see below. > The interesting question here is: > > Do we really need to call panic_print_sys_info() here? All information > > provided by panic_print_sys_info(false) can be found also in > > the crash dump. > > So, we indeed need that in our use case. Crash is meant to be used > post-mortem, i.e., you made a full vmcore collection and then, of > course, you have basically all the data you need accessible though the > crash tool. > > Problem is: in our use case, we want more data than a regular dmesg in a > panic event (hence we use panic_print), but we don't collect a full > crash dump, due to its big size. Also, as you can imagine, we do favor > pstore over kdump, but it might fail due to a variety of reasons (like > not having a free RAM buffer for ramoops), so kdump is our fallback. > Hence, we'd like to be able to use panic_print with both kdump and > pstore, and for that, both patches are needed. Fair enough. OK, do we have any specific reason why panic_print_sys_info() should get called right before kmsg_dump() when this code patch is used? Alternative solution would be to remove the check of kexec_crash_loaded() and always call panic_print_sys_info(false) at the beginning (after kgdb_panic(buf)). The advantage is that panic_print_sys_info(false) will be always called on the same location. It will give the same results in all code paths so that it will be easier to interpret them. And it will have the same problems so it should be easier to debug and maintain. It is possible that it will not work for some users. Also it is possible that it might cause some problems. But it is hard to guess at least for me. I think that we might try it and see if anyone complains. Honestly, I think that only few people use panic_printk_sys_info(). And your use-case makes sense. Best Regards, Petr ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH V2] panic: Move panic_print before kmsg dumpers
On 13/01/2022 08:50, Petr Mladek wrote: >> @@ -249,7 +252,7 @@ void panic(const char *fmt, ...) >> * show some extra information on kernel log if it was set... >> */ >> if (kexec_crash_loaded()) >> -panic_print_sys_info(); >> +panic_print_sys_info(false); > > panic_print_sys_info(false) will be called twice when both > kexec_crash_loaded() and _crash_kexec_post_notifiers are true. > > Do we really need to call panic_print_sys_info() here? All information > provided by panic_print_sys_info(false) can be found also in > the crash dump. > >> /* >> * If we have crashed and we have a crash kernel loaded let it handle >> @@ -283,6 +286,8 @@ void panic(const char *fmt, ...) >> */ >> atomic_notifier_call_chain(_notifier_list, 0, buf); >> >> +panic_print_sys_info(false); > > This is where the info might be printed 2nd time. > >> + >> kmsg_dump(KMSG_DUMP_PANIC); >> >> /* > > Otherwise, the change makes sense to me. > > Best Regards, > Petr Hi Petr, thanks for your great review! I see you also commented in the thread of the patch introducing the panic_print_sys_info() before kdump. Thanks for catching this issue - indeed, if "_crash_kexec_post_notifiers" is true, with this patch we print stuff twice. I will submit a V3 that guards against that, using a bool, makes sense to you? The interesting question here is: > Do we really need to call panic_print_sys_info() here? All information > provided by panic_print_sys_info(false) can be found also in > the crash dump. So, we indeed need that in our use case. Crash is meant to be used post-mortem, i.e., you made a full vmcore collection and then, of course, you have basically all the data you need accessible though the crash tool. Problem is: in our use case, we want more data than a regular dmesg in a panic event (hence we use panic_print), but we don't collect a full crash dump, due to its big size. Also, as you can imagine, we do favor pstore over kdump, but it might fail due to a variety of reasons (like not having a free RAM buffer for ramoops), so kdump is our fallback. Hence, we'd like to be able to use panic_print with both kdump and pstore, and for that, both patches are needed. Cheers, Guilherme ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH V2] panic: Move panic_print before kmsg dumpers
On Thu 2022-01-06 18:28:35, Guilherme G. Piccoli wrote: > The panic_print setting allows users to collect more information in a > panic event, like memory stats, tasks, CPUs backtraces, etc. > This is a pretty interesting debug mechanism, but currently the print > event happens *after* kmsg_dump(), meaning that pstore, for example, > cannot collect a dmesg with the panic_print information. > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -249,7 +252,7 @@ void panic(const char *fmt, ...) >* show some extra information on kernel log if it was set... >*/ > if (kexec_crash_loaded()) > - panic_print_sys_info(); > + panic_print_sys_info(false); panic_print_sys_info(false) will be called twice when both kexec_crash_loaded() and _crash_kexec_post_notifiers are true. Do we really need to call panic_print_sys_info() here? All information provided by panic_print_sys_info(false) can be found also in the crash dump. > /* >* If we have crashed and we have a crash kernel loaded let it handle > @@ -283,6 +286,8 @@ void panic(const char *fmt, ...) >*/ > atomic_notifier_call_chain(_notifier_list, 0, buf); > > + panic_print_sys_info(false); This is where the info might be printed 2nd time. > + > kmsg_dump(KMSG_DUMP_PANIC); > > /* Otherwise, the change makes sense to me. Best Regards, Petr ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec