Re: [PATCH 0/2] kexec: accumulate and release the size of crashkernel

2022-07-04 Thread Baoquan He
On 07/04/22 at 07:41pm, Kaihao Bai wrote:
> Currently x86 and arm64 support reserving a low memory range for
> crashkernel. When crashkernel=Y,low is defined, the main kernel
> reserves another memblock (separate from crashkernel=X,high, which is
> stored in crashk_res) for crashkernel and stores it in crashk_low_res.
> 
> The implementations of crash_get_memory_size() and crash_shrink_memory()
> do not consider the extra reserved memory range if it exists. Thus, first
> add this range to the size of crashkernel and export the size
> via /sys/kernel/kexec_crash_size.
> 
> When a new size is written to /sys/kernel/kexec_crash_size, both reserved
> ranges may be released if the new size is smaller than the current size.
> The order of release is crashk_res -> crashk_low_res. Only if the new size
> defined by the user is smaller than the size of the low memory range does
> the kernel continue to release the reserved low memory range, after
> completely releasing the high memory range.

Sorry, I don't like this patchset.

I bet you haven't encountered a real problem in your production environment.
Regarding crashkernel=,high|low, the ,low memory is for DMA and
other requirements for memory in the lower range. The ,high memory is for
kernel/initrd loading, kernel data, and running user space programs. When
you configure crashkernel= in your system, you need to evaluate what
value is suitable. /sys/kernel/kexec_crash_size is an interface you
can use to tune the memory usage. People are not advised to
free the whole crashkernel reservation via the interface.

So, please leave this as is, unless you have a real case where this
change is needed.

Thanks
Baoquan
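
For readers unfamiliar with the interface mentioned above, the following is a minimal sketch of reading the value from user space. It assumes the file holds a single decimal byte count; the helper name read_crash_size() is hypothetical and not part of any kernel or kexec-tools API.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a decimal byte count from a sysfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
int read_crash_size(const char *path, unsigned long *size)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    /* The file is assumed to contain one unsigned decimal number. */
    ok = (fscanf(f, "%lu", size) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

On a running system, read_crash_size("/sys/kernel/kexec_crash_size", &size) would report the current reservation. Shrinking is done by writing a smaller value to the same file, which requires root.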


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH 2/2] kexec: release reserved memory ranges to RAM if crashk_low_res defined

2022-07-04 Thread Kaihao Bai
If a low memory range is reserved for crashkernel, that range can never be
freed back to System RAM. However, the high memory range corresponding
to crashk_res can be freed to RAM through /sys/kernel/kexec_crash_size. If a
smaller size is written to /sys/kernel/kexec_crash_size, the part exceeding
the new size is released.

To support releasing the low memory range, we should determine whether
the new size is greater than the accumulated size. If not, the reserved
high memory range is released first. If the new size is smaller
than the size of the low memory range, we continue to release the reserved
low memory range after completely releasing the high memory range.

Signed-off-by: Kaihao Bai 
---
 kernel/kexec_core.c | 75 +++--
 1 file changed, 56 insertions(+), 19 deletions(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 137f6eb..e89c171 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1031,12 +1031,42 @@ void __weak crash_free_reserved_phys_range(unsigned long begin,
free_reserved_page(boot_pfn_to_page(addr >> PAGE_SHIFT));
 }
 
+static int __crash_shrink_memory(struct resource *crashkernel,
+unsigned long start, unsigned long end)
+{
+   int ret = 0;
+   struct resource *ram_res;
+
+   ram_res = kzalloc(sizeof(*ram_res), GFP_KERNEL);
+   if (!ram_res) {
+   ret = -ENOMEM;
+   return ret;
+   }
+
+   crash_free_reserved_phys_range(end, crashkernel->end);
+
+   if ((start == end) && (crashkernel->parent != NULL))
+   release_resource(crashkernel);
+
+   ram_res->start = end;
+   ram_res->end = crashk_res.end;
+   ram_res->flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+   ram_res->name = "System RAM";
+
+   crashkernel->end = end - 1;
+
+   insert_resource(&iomem_resource, ram_res);
+
+   return ret;
+}
+
 int crash_shrink_memory(unsigned long new_size)
 {
int ret = 0;
unsigned long start, end;
+   unsigned long low_start, low_end;
unsigned long old_size;
-   struct resource *ram_res;
+   unsigned long low_old_size;
 
mutex_lock(&kexec_mutex);
 
@@ -1047,33 +1077,40 @@ int crash_shrink_memory(unsigned long new_size)
start = crashk_res.start;
end = crashk_res.end;
old_size = (end == 0) ? 0 : end - start + 1;
+   low_start = crashk_low_res.start;
+   low_end = crashk_low_res.end;
+   low_old_size = (low_end == 0) ? 0 : low_end - low_start + 1;
+   old_size += low_old_size;
+
if (new_size >= old_size) {
ret = (new_size == old_size) ? 0 : -EINVAL;
goto unlock;
}
+   if (start != end) {
+   start = roundup(start, KEXEC_CRASH_MEM_ALIGN);
 
-   ram_res = kzalloc(sizeof(*ram_res), GFP_KERNEL);
-   if (!ram_res) {
-   ret = -ENOMEM;
-   goto unlock;
-   }
-
-   start = roundup(start, KEXEC_CRASH_MEM_ALIGN);
-   end = roundup(start + new_size, KEXEC_CRASH_MEM_ALIGN);
-
-   crash_free_reserved_phys_range(end, crashk_res.end);
+   /*
+* If the new_size is smaller than the reserved lower memory
+* range of crashkernel, it releases all higher memory range.
+* Otherwise it releases part of higher range.
+*/
+   end = (new_size <= low_old_size) ?
+   roundup(start, KEXEC_CRASH_MEM_ALIGN) :
+   roundup(start + new_size - low_old_size,
+   KEXEC_CRASH_MEM_ALIGN);
 
-   if ((start == end) && (crashk_res.parent != NULL))
-   release_resource(&crashk_res);
+   ret = __crash_shrink_memory(&crashk_res, start, end);
 
-   ram_res->start = end;
-   ram_res->end = crashk_res.end;
-   ram_res->flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
-   ram_res->name = "System RAM";
+   if (ret)
+   goto unlock;
+   }
 
-   crashk_res.end = end - 1;
+   if (new_size < low_old_size) {
+   low_start = roundup(low_start, KEXEC_CRASH_MEM_ALIGN);
+   low_end = roundup(low_start + new_size, KEXEC_CRASH_MEM_ALIGN);
 
-   insert_resource(&iomem_resource, ram_res);
+   ret = __crash_shrink_memory(&crashk_low_res, low_start, low_end);
+   }
 
 unlock:
mutex_unlock(&kexec_mutex);
-- 
1.8.3.1
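
The release order described in the commit message above can be modelled in user space. This is only a sketch under stated assumptions: struct crash_sizes and model_shrink() are hypothetical names, sizes stand in for the real resource ranges, and the KEXEC_CRASH_MEM_ALIGN rounding done by the patch is deliberately ignored.

```c
/* Hypothetical stand-in for the two reserved ranges; sizes in bytes. */
struct crash_sizes {
    unsigned long high; /* models the size of crashk_res */
    unsigned long low;  /* models the size of crashk_low_res */
};

/*
 * Shrink the combined reservation to new_size, releasing the high range
 * first and touching the low range only once the high range is gone.
 * Returns 0 on success, -1 if new_size would grow the reservation.
 */
int model_shrink(struct crash_sizes *s, unsigned long new_size)
{
    unsigned long total = s->high + s->low;

    if (new_size > total)
        return -1;

    if (new_size >= s->low) {
        /* Only the high range shrinks; the low range stays intact. */
        s->high = new_size - s->low;
    } else {
        /* The high range is fully released, then the low range shrinks. */
        s->high = 0;
        s->low = new_size;
    }
    return 0;
}
```

With a 256-byte high range and a 64-byte low range, shrinking to 128 leaves 64 bytes in each range, and shrinking further to 32 releases the high range entirely and halves the low range.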




[PATCH 0/2] kexec: accumulate and release the size of crashkernel

2022-07-04 Thread Kaihao Bai
Currently x86 and arm64 support reserving a low memory range for
crashkernel. When crashkernel=Y,low is defined, the main kernel
reserves another memblock (separate from crashkernel=X,high, which is
stored in crashk_res) for crashkernel and stores it in crashk_low_res.

The implementations of crash_get_memory_size() and crash_shrink_memory()
do not consider the extra reserved memory range if it exists. Thus, first
add this range to the size of crashkernel and export the size
via /sys/kernel/kexec_crash_size.

When a new size is written to /sys/kernel/kexec_crash_size, both reserved
ranges may be released if the new size is smaller than the current size.
The order of release is crashk_res -> crashk_low_res. Only if the new size
defined by the user is smaller than the size of the low memory range does
the kernel continue to release the reserved low memory range, after
completely releasing the high memory range.

Kaihao Bai (2):
  kexec: accumulate kexec_crash_size if crashk_low_res defined
  kexec: release reserved memory ranges to RAM if crashk_low_res defined

 kernel/kexec_core.c | 77 -
 1 file changed, 58 insertions(+), 19 deletions(-)

-- 
1.8.3.1




[PATCH 1/2] kexec: accumulate kexec_crash_size if crashk_low_res defined

2022-07-04 Thread Kaihao Bai
Currently x86 and arm64 support reserving a low memory range for
crashkernel. When crashkernel=Y,low is defined, the main kernel
reserves another memblock (separate from crashkernel=X,high, which is
stored in crashk_res) for crashkernel and stores it in crashk_low_res. But
the value of /sys/kernel/kexec_crash_size only accounts for the size of
crashk_res; the size of crashk_low_res is not included.

To keep /sys/kernel/kexec_crash_size consistent, when
crashk_low_res is defined, its size needs to be added to
kexec_crash_size.

Signed-off-by: Kaihao Bai 
---
 kernel/kexec_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 4d34c78..137f6eb 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1016,6 +1016,8 @@ size_t crash_get_memory_size(void)
mutex_lock(&kexec_mutex);
if (crashk_res.end != crashk_res.start)
size = resource_size(&crashk_res);
+   if (crashk_low_res.end != crashk_low_res.start)
+   size += resource_size(&crashk_low_res);
mutex_unlock(&kexec_mutex);
return size;
 }
-- 
1.8.3.1
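
The accumulation in this hunk can be illustrated outside the kernel. The sketch below assumes inclusive [start, end] ranges, which is how the kernel's resource_size() computes a size (end - start + 1). struct range, range_size() and total_crash_size() are hypothetical stand-ins, not kernel symbols.

```c
/* Hypothetical stand-in for struct resource: an inclusive [start, end] range. */
struct range {
    unsigned long start;
    unsigned long end;
};

/* Same arithmetic as resource_size(): the end address is inclusive. */
unsigned long range_size(const struct range *r)
{
    return r->end - r->start + 1;
}

/*
 * Sum the high and low reservations. A range whose start equals its end
 * is treated as unreserved, mirroring the checks in the hunk above.
 */
unsigned long total_crash_size(const struct range *high,
                               const struct range *low)
{
    unsigned long size = 0;

    if (high->end != high->start)
        size = range_size(high);
    if (low->end != low->start)
        size += range_size(low);
    return size;
}
```

With only a 4 KiB high range reserved the total is 4096. Adding a 256-byte low range raises it to 4352, which is the value the patched sysfs file would be expected to report in this model.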




[PATCH v2] proc/vmcore: fix potential memory leak in vmcore_init()

2022-07-04 Thread Jianglei Nie
elfcorehdr_alloc() allocates a memory chunk for elfcorehdr_addr with
kzalloc(). If is_vmcore_usable() returns false, elfcorehdr_addr is a
predefined value. If parse_crash_elf_headers() encounters an error and
returns a negative value, elfcorehdr_addr should be released with
elfcorehdr_free().

Fix this by calling elfcorehdr_free() when parse_crash_elf_headers()
fails.

Signed-off-by: Jianglei Nie 
---
 fs/proc/vmcore.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index 4eaeb645e759..86887bd90263 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -1569,7 +1569,7 @@ static int __init vmcore_init(void)
rc = parse_crash_elf_headers();
if (rc) {
pr_warn("Kdump: vmcore not initialized\n");
-   return rc;
+   goto fail;
}
elfcorehdr_free(elfcorehdr_addr);
elfcorehdr_addr = ELFCORE_ADDR_ERR;
@@ -1577,6 +1577,9 @@ static int __init vmcore_init(void)
proc_vmcore = proc_create("vmcore", S_IRUSR, NULL, &vmcore_proc_ops);
if (proc_vmcore)
proc_vmcore->size = vmcore_size;
+
+fail:
+   elfcorehdr_free(elfcorehdr_addr);
return 0;
 }
 fs_initcall(vmcore_init);
-- 
2.25.1
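
The general idea of the fix, releasing a buffer on the error path through a shared label, can be sketched in user space as follows. Everything here is a hypothetical model: malloc()/free() stand in for elfcorehdr_alloc()/elfcorehdr_free(), fake_parse() stands in for parse_crash_elf_headers(), and freed_count exists only so the behaviour can be observed. Unlike the hunk above, this sketch propagates the error code from the failure path, which is a choice made for the example and not a statement about what the patch should do.

```c
#include <errno.h>
#include <stdlib.h>

/* Counts how many times the model buffer was released (for observation only). */
int freed_count;

/* Hypothetical stand-in for the parse step: fails with -EINVAL when !ok. */
int fake_parse(int ok)
{
    return ok ? 0 : -EINVAL;
}

/* Model of an init function that frees its header buffer on every exit path. */
int init_model(int parse_ok)
{
    int rc;
    char *hdr = malloc(64); /* models the header buffer */

    if (!hdr)
        return -ENOMEM;

    rc = fake_parse(parse_ok);
    if (rc)
        goto out; /* error: skip the remaining setup, still free hdr */

    /* Further setup would go here on the success path. */

out:
    free(hdr);
    freed_count++;
    return rc;
}
```

Both init_model(1) and init_model(0) release the buffer exactly once; only the return value differs.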




Re: [PATCHv7 11/14] x86: Disable kexec if system has unaccepted memory

2022-07-04 Thread Dave Young
On Wed, 29 Jun 2022 at 08:59, Kirill A. Shutemov
 wrote:
>
> On Tue, Jun 28, 2022 at 05:10:56PM -0700, Dave Hansen wrote:
> > On 6/28/22 16:51, Kirill A. Shutemov wrote:
> > > On Fri, Jun 24, 2022 at 05:00:05AM +0300, Kirill A. Shutemov wrote:
> > >>> If there is some deep and fundamental why this can not be supported
> > >>> then it probably makes sense to put some code in the arch_kexec_load
> > >>> hook that verifies that deep and fundamental reason is present.
> > ...
> > > +   /*
> > > +* TODO: Information on memory acceptance status has to be communicated
> > > +* between kernel.
> > > +*/
> >
> > So, the deep and fundamental reason is... drum roll... you haven't
> > gotten around to implementing bitmap passing yet?!?!?   I have the
> > feeling that wasn't what Eric was looking for.
>
> The deep fundamental reason is that everything cannot be implemented and
> upstreamed at once.

If the only thing needed is to pass the bitmap to the kexec kernel, then
since you have reserved the bitmap memory, I guess it is straightforward
to set the kexec bootparams.unaccepted_memory to the old value. I am not
sure whether there are problems when the decompression code accepts memory
again, though.
For the kernel's kexec_file_load path, refer to setup_boot_parameters()
in arch/x86/kernel/kexec-bzimage64.c;
for the kexec-tools kexec_load path, refer to
setup_linux_system_parameters() in kexec/arch/i386/x86-linux-setup.c.

Thanks
Dave

