Re: [PATCH 1/3] memblock: define functions to set the usable memory range

2022-01-13 Thread Frank van der Linden
On Fri, Jan 14, 2022 at 12:10:46AM +, Frank van der Linden wrote:
> Thanks for discussing this further! I tried the above change, although
> I wrapped it in an if (cap_mem_size != 0), so that a normal kernel
> doesn't get its entire memory marked as reserved.

Just to clarify: I had to do that because the code is slightly different
on 5.15, where I tried it. It wasn't a bug in the proposed patch.

Anyway, will try 5.17-rc tomorrow, I'll let you know, thanks again.

- Frank

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/3] memblock: define functions to set the usable memory range

2022-01-13 Thread Frank van der Linden
On Thu, Jan 13, 2022 at 07:33:11PM +0200, Mike Rapoport wrote:
> On Tue, Jan 11, 2022 at 08:44:41PM +, Frank van der Linden wrote:
> > On Tue, Jan 11, 2022 at 12:31:58PM +0200, Mike Rapoport wrote:
> > > > --- a/include/linux/memblock.h
> > > > +++ b/include/linux/memblock.h
> > > > @@ -481,6 +481,8 @@ phys_addr_t memblock_reserved_size(void);
> > > >  phys_addr_t memblock_start_of_DRAM(void);
> > > >  phys_addr_t memblock_end_of_DRAM(void);
> > > >  void memblock_enforce_memory_limit(phys_addr_t memory_limit);
> > > > +void memblock_set_usable_range(phys_addr_t base, phys_addr_t size);
> > > > +void memblock_enforce_usable_range(void);
> > > >  void memblock_cap_memory_range(phys_addr_t base, phys_addr_t size);
> > > >  void memblock_mem_limit_remove_map(phys_addr_t limit);
> > >
> > > We already have 3 very similar interfaces that deal with memory capping.
> > > Now you suggest to add fourth that will "generically" solve a single use
> > > case of DT, EFI and kdump interaction on arm64.
> > >
> > > Looks like a workaround for a fundamental issue of incompatibility between
> > > DT and EFI wrt memory registration.
> >
> > Yep, I figured this would be the main argument against this - arm64
> > already added several other more-or-less special cased interfaces over
> > time.
> >
> > I'm more than happy to solve this in a different way.
> >
> > What would you suggest:
> >
> > 1) Try to merge the similar interfaces in to one.
> > 2) Just deal with it at a lower (arm64) level?
> > 3) Some other way?
> 
> We've discussed this with Ard on IRC, and our conclusion was that on arm64
> kdump kernel should have memblock.memory exactly the same as the normal
> kernel. Then, the memory outside usable-memory-range should be reserved so
> that kdump kernel won't step over it.
> 
> With that, simple (untested) patch below could be what we need:
> 
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index bdca35284ceb..371418dffaf1 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -1275,7 +1275,8 @@ void __init early_init_dt_scan_nodes(void)
> of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> 
> /* Handle linux,usable-memory-range property */
> -   memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
> +   memblock_reserve(0, cap_mem_addr);
> +   memblock_reserve(cap_mem_addr + cap_mem_size, PHYS_ADDR_MAX);
>  }
> 
>  bool __init early_init_dt_scan(void *params)

Thanks for discussing this further! I tried the above change, although
I wrapped it in an if (cap_mem_size != 0), so that a normal kernel
doesn't get its entire memory marked as reserved.

The crash kernel hung without producing output - I haven't chased that
down. This particular kernel was a bit older (5.15), so I'll try again
with 5.17-rc to make sure.

- Frank

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/3] memblock: define functions to set the usable memory range

2022-01-13 Thread Mike Rapoport
On Tue, Jan 11, 2022 at 08:44:41PM +, Frank van der Linden wrote:
> On Tue, Jan 11, 2022 at 12:31:58PM +0200, Mike Rapoport wrote:
> > > --- a/include/linux/memblock.h
> > > +++ b/include/linux/memblock.h
> > > @@ -481,6 +481,8 @@ phys_addr_t memblock_reserved_size(void);
> > >  phys_addr_t memblock_start_of_DRAM(void);
> > >  phys_addr_t memblock_end_of_DRAM(void);
> > >  void memblock_enforce_memory_limit(phys_addr_t memory_limit);
> > > +void memblock_set_usable_range(phys_addr_t base, phys_addr_t size);
> > > +void memblock_enforce_usable_range(void);
> > >  void memblock_cap_memory_range(phys_addr_t base, phys_addr_t size);
> > >  void memblock_mem_limit_remove_map(phys_addr_t limit);
> > 
> > We already have 3 very similar interfaces that deal with memory capping.
> > Now you suggest to add fourth that will "generically" solve a single use
> > case of DT, EFI and kdump interaction on arm64.
> > 
> > Looks like a workaround for a fundamental issue of incompatibility between
> > DT and EFI wrt memory registration.
> 
> Yep, I figured this would be the main argument against this - arm64
> already added several other more-or-less special cased interfaces over
> time.
> 
> I'm more than happy to solve this in a different way.
> 
> What would you suggest:
> 
> 1) Try to merge the similar interfaces in to one.
> 2) Just deal with it at a lower (arm64) level?
> 3) Some other way?

We've discussed this with Ard on IRC, and our conclusion was that on arm64
kdump kernel should have memblock.memory exactly the same as the normal
kernel. Then, the memory outside usable-memory-range should be reserved so
that kdump kernel won't step over it.

With that, simple (untested) patch below could be what we need:

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index bdca35284ceb..371418dffaf1 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -1275,7 +1275,8 @@ void __init early_init_dt_scan_nodes(void)
of_scan_flat_dt(early_init_dt_scan_memory, NULL);
 
/* Handle linux,usable-memory-range property */
-   memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
+   memblock_reserve(0, cap_mem_addr);
+   memblock_reserve(cap_mem_addr + cap_mem_size, PHYS_ADDR_MAX);
 }
 
 bool __init early_init_dt_scan(void *params)

> Thanks,
> 
> - Frank
> 

-- 
Sincerely yours,
Mike.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH V2] panic: Move panic_print before kmsg dumpers

2022-01-13 Thread Guilherme G. Piccoli
On 13/01/2022 11:22, Petr Mladek wrote:
> [...]
> OK, do we have any specific reason why panic_print_sys_info()
> should get called right before kmsg_dump() when this code patch
> is used?
> 
> Alternative solution would be to remove the check of
> kexec_crash_loaded() and always call panic_print_sys_info(false)
> at the beginning (after kgdb_panic(buf)).
> 
> The advantage is that panic_print_sys_info(false) will be always
> called on the same location. It will give the same results
> in all code paths so that it will be easier to interpret them.
> And it will have the same problems so it should be easier
> to debug and maintain.
> 
> It is possible that it will not work for some users. Also it is
> possible that it might cause some problems. But it is hard to
> guess at least for me.
> 
> I think that we might try it and see if anyone complains.
> Honestly, I think that only few people use panic_printk_sys_info().
> And your use-case makes sense.
> 
> Best Regards,
> Petr

Hi Petr, thanks for your idea - it's simple and effective, would resolve
the problems in a straightforward way. But there are some cons, let me
detail more.

Currently (in linux-next), if users set panic_print but not kdump, the
panic_print_sys_info() is called *after* the panic notifiers. If user
has kdump configured and still sets panic_print (which is our use case),
then panic_print_sys_info() is called _before_ the panic notifiers. In
other words, the behavior for non-kdump users is the same as before,
since the merge of panic_print functionality.

Your idea brings the printing earlier, always before the panic
notifiers. This isn't terrible, but might lead to unfortunate and
hard-to-debug problems; for example, some panic notifiers are
rcu_panic() and hung_task_panic(), both are simple functions to disable
RCU warnings and hung task detector in panic scenarios. If we go with
your idea, these won't get called before panic_print_sys_info(), even if
kdump is not set. So, since the panic_print triggers a lot of printing
in the console, we could face a stall and trigger RCU messages, maybe
intermixed with the panic_print information.

Again, this is not the end of the world, we could live with that. But
keeping two calls - and guarding against double print using
kexec_crash_loaded() as per your idea - is completely non-invasive and
doesn't change the current behavior.

So, let me know if we should go with your idea or keep both calls but
guarding against double printing to keep the current behavior untouched.
Both ways work for me, I'm slightly inclined to the latter, but I can
rework the patch if you prefer and use your idea!
Cheers,


Guilherme

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH V2] panic: Move panic_print before kmsg dumpers

2022-01-13 Thread Petr Mladek
On Thu 2022-01-13 09:34:01, Guilherme G. Piccoli wrote:
> On 13/01/2022 08:50, Petr Mladek wrote:
> >> @@ -249,7 +252,7 @@ void panic(const char *fmt, ...)
> >> * show some extra information on kernel log if it was set...
> >> */
> >>if (kexec_crash_loaded())
> >> -  panic_print_sys_info();
> >> +  panic_print_sys_info(false);
> > 
> > panic_print_sys_info(false) will be called twice when both
> > kexec_crash_loaded() and _crash_kexec_post_notifiers are true.
> > 
> > Do we really need to call panic_print_sys_info() here? All information
> > provided by panic_print_sys_info(false) can be found also in
> > the crash dump.
> > 
> >>/*
> >> * If we have crashed and we have a crash kernel loaded let it handle
> >> @@ -283,6 +286,8 @@ void panic(const char *fmt, ...)
> >> */
> >>atomic_notifier_call_chain(_notifier_list, 0, buf);
> >>  
> >> +  panic_print_sys_info(false);
> > 
> > This is where the info might be printed 2nd time.
> > 
> >> +
> >>kmsg_dump(KMSG_DUMP_PANIC);

> Thanks for catching this issue - indeed, if
> "_crash_kexec_post_notifiers" is true, with this patch we print stuff
> twice. I will submit a V3 that guards against that, using a bool, makes
> sense to you?

It might be possible to check kexec_crash_loaded() on the two
locations. But I think about even easier solution, see below.


> The interesting question here is:
> > Do we really need to call panic_print_sys_info() here? All information
> > provided by panic_print_sys_info(false) can be found also in
> > the crash dump.
> 
> So, we indeed need that in our use case. Crash is meant to be used
> post-mortem, i.e., you made a full vmcore collection and then, of
> course, you have basically all the data you need accessible though the
> crash tool.
> 
> Problem is: in our use case, we want more data than a regular dmesg in a
> panic event (hence we use panic_print), but we don't collect a full
> crash dump, due to its big size. Also, as you can imagine, we do favor
> pstore over kdump, but it might fail due to a variety of reasons (like
> not having a free RAM buffer for ramoops), so kdump is our fallback.
> Hence, we'd like to be able to use panic_print with both kdump and
> pstore, and for that, both patches are needed.

Fair enough.

OK, do we have any specific reason why panic_print_sys_info()
should get called right before kmsg_dump() when this code patch
is used?

Alternative solution would be to remove the check of
kexec_crash_loaded() and always call panic_print_sys_info(false)
at the beginning (after kgdb_panic(buf)).

The advantage is that panic_print_sys_info(false) will be always
called on the same location. It will give the same results
in all code paths so that it will be easier to interpret them.
And it will have the same problems so it should be easier
to debug and maintain.

It is possible that it will not work for some users. Also it is
possible that it might cause some problems. But it is hard to
guess at least for me.

I think that we might try it and see if anyone complains.
Honestly, I think that only few people use panic_printk_sys_info().
And your use-case makes sense.

Best Regards,
Petr

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH V2] panic: Move panic_print before kmsg dumpers

2022-01-13 Thread Guilherme G. Piccoli
On 13/01/2022 08:50, Petr Mladek wrote:
>> @@ -249,7 +252,7 @@ void panic(const char *fmt, ...)
>>   * show some extra information on kernel log if it was set...
>>   */
>>  if (kexec_crash_loaded())
>> -panic_print_sys_info();
>> +panic_print_sys_info(false);
> 
> panic_print_sys_info(false) will be called twice when both
> kexec_crash_loaded() and _crash_kexec_post_notifiers are true.
> 
> Do we really need to call panic_print_sys_info() here? All information
> provided by panic_print_sys_info(false) can be found also in
> the crash dump.
> 
>>  /*
>>   * If we have crashed and we have a crash kernel loaded let it handle
>> @@ -283,6 +286,8 @@ void panic(const char *fmt, ...)
>>   */
>>  atomic_notifier_call_chain(_notifier_list, 0, buf);
>>  
>> +panic_print_sys_info(false);
> 
> This is where the info might be printed 2nd time.
> 
>> +
>>  kmsg_dump(KMSG_DUMP_PANIC);
>>  
>>  /*
> 
> Otherwise, the change makes sense to me.
> 
> Best Regards,
> Petr

Hi Petr, thanks for your great review!
I see you also commented in the thread of the patch introducing the
panic_print_sys_info() before kdump.

Thanks for catching this issue - indeed, if
"_crash_kexec_post_notifiers" is true, with this patch we print stuff
twice. I will submit a V3 that guards against that, using a bool, makes
sense to you?

The interesting question here is:
> Do we really need to call panic_print_sys_info() here? All information
> provided by panic_print_sys_info(false) can be found also in
> the crash dump.

So, we indeed need that in our use case. Crash is meant to be used
post-mortem, i.e., you made a full vmcore collection and then, of
course, you have basically all the data you need accessible though the
crash tool.

Problem is: in our use case, we want more data than a regular dmesg in a
panic event (hence we use panic_print), but we don't collect a full
crash dump, due to its big size. Also, as you can imagine, we do favor
pstore over kdump, but it might fail due to a variety of reasons (like
not having a free RAM buffer for ramoops), so kdump is our fallback.
Hence, we'd like to be able to use panic_print with both kdump and
pstore, and for that, both patches are needed.

Cheers,


Guilherme

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH V2] panic: Move panic_print before kmsg dumpers

2022-01-13 Thread Petr Mladek
On Thu 2022-01-06 18:28:35, Guilherme G. Piccoli wrote:
> The panic_print setting allows users to collect more information in a
> panic event, like memory stats, tasks, CPUs backtraces, etc.
> This is a pretty interesting debug mechanism, but currently the print
> event happens *after* kmsg_dump(), meaning that pstore, for example,
> cannot collect a dmesg with the panic_print information.

> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -249,7 +252,7 @@ void panic(const char *fmt, ...)
>* show some extra information on kernel log if it was set...
>*/
>   if (kexec_crash_loaded())
> - panic_print_sys_info();
> + panic_print_sys_info(false);

panic_print_sys_info(false) will be called twice when both
kexec_crash_loaded() and _crash_kexec_post_notifiers are true.

Do we really need to call panic_print_sys_info() here? All information
provided by panic_print_sys_info(false) can be found also in
the crash dump.

>   /*
>* If we have crashed and we have a crash kernel loaded let it handle
> @@ -283,6 +286,8 @@ void panic(const char *fmt, ...)
>*/
>   atomic_notifier_call_chain(_notifier_list, 0, buf);
>  
> + panic_print_sys_info(false);

This is where the info might be printed 2nd time.

> +
>   kmsg_dump(KMSG_DUMP_PANIC);
>  
>   /*

Otherwise, the change makes sense to me.

Best Regards,
Petr

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec