Re: [PATCH 0/2] kexec: accumulate and release the size of crashkernel
On 07/04/22 at 07:41pm, Kaihao Bai wrote:
> Currently x86 and arm64 support reserving a low memory range for
> crashkernel. When crashkernel=Y,low is defined, the main kernel
> reserves another memblock (separate from the crashkernel=X,high range,
> which is stored in crashk_res) for crashkernel and stores it in
> crashk_low_res.
>
> The implementations of crash_get_memory_size() and crash_shrink_memory()
> do not consider the extra reserved memory range if it exists. Thus,
> first accumulate this range into the size of crashkernel and export the
> size via /sys/kernel/kexec_crash_size.
>
> When a value is written to /sys/kernel/kexec_crash_size, both reserved
> ranges may be released if the new size is smaller than the current size.
> The order of release is (crashk_res -> crashk_low_res). Only if the new
> size defined by the user is smaller than the size of the low memory
> range do we continue to release the reserved low memory range after
> completely releasing the high memory range.

Sorry, I don't like this patchset. I bet you don't encounter a real
problem in your product environment.

Regarding crashkernel=,high|low, the ,low memory is for DMA and other
requirements for memory in the lower range. The ,high memory is for
kernel/initrd loading, kernel data, and running user space programs.
When you configure crashkernel= in your system, you need to evaluate
what value is suitable. /sys/kernel/kexec_crash_size is an interface
you can make use of to tune the memory usage. People are not advised
to free the whole crashkernel reservation via the interface.

So, please leave this as is, unless you have a real case where this
change is needed.

Thanks
Baoquan

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
[PATCH 2/2] kexec: release reserved memory ranges to RAM if crashk_low_res defined
If a low memory range is reserved for crashkernel, that range can never
be freed back to System RAM. However, the high memory range
corresponding to crashk_res can be freed to RAM through
/sys/kernel/kexec_crash_size. If I write a smaller size to
/sys/kernel/kexec_crash_size, the part exceeding the new size is
released.

To support releasing the low memory range, we should determine whether
the new size is greater than the accumulated size. If not, the reserved
high memory range is released first. If the new size is smaller than
the size of the low memory range, we continue to release the reserved
low memory range after completely releasing the high memory range.

Signed-off-by: Kaihao Bai
---
 kernel/kexec_core.c | 75 +++--
 1 file changed, 56 insertions(+), 19 deletions(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 137f6eb..e89c171 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1031,12 +1031,42 @@ void __weak crash_free_reserved_phys_range(unsigned long begin,
 		free_reserved_page(boot_pfn_to_page(addr >> PAGE_SHIFT));
 }
 
+static int __crash_shrink_memory(struct resource *crashkernel,
+				 unsigned long start, unsigned long end)
+{
+	int ret = 0;
+	struct resource *ram_res;
+
+	ram_res = kzalloc(sizeof(*ram_res), GFP_KERNEL);
+	if (!ram_res) {
+		ret = -ENOMEM;
+		return ret;
+	}
+
+	crash_free_reserved_phys_range(end, crashkernel->end);
+
+	if ((start == end) && (crashkernel->parent != NULL))
+		release_resource(crashkernel);
+
+	ram_res->start = end;
+	ram_res->end = crashk_res.end;
+	ram_res->flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+	ram_res->name = "System RAM";
+
+	crashkernel->end = end - 1;
+
+	insert_resource(&iomem_resource, ram_res);
+
+	return ret;
+}
+
 int crash_shrink_memory(unsigned long new_size)
 {
 	int ret = 0;
 	unsigned long start, end;
+	unsigned long low_start, low_end;
 	unsigned long old_size;
-	struct resource *ram_res;
+	unsigned long low_old_size;
 
 	mutex_lock(&kexec_mutex);
 
@@ -1047,33 +1077,40 @@ int crash_shrink_memory(unsigned long new_size)
 	start = crashk_res.start;
 	end = crashk_res.end;
 	old_size = (end == 0) ? 0 : end - start + 1;
+	low_start = crashk_low_res.start;
+	low_end = crashk_low_res.end;
+	low_old_size = (low_end == 0) ? 0 : low_end - low_start + 1;
+	old_size += low_old_size;
+
 	if (new_size >= old_size) {
 		ret = (new_size == old_size) ? 0 : -EINVAL;
 		goto unlock;
 	}
+	if (start != end) {
+		start = roundup(start, KEXEC_CRASH_MEM_ALIGN);
 
-	ram_res = kzalloc(sizeof(*ram_res), GFP_KERNEL);
-	if (!ram_res) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
-
-	start = roundup(start, KEXEC_CRASH_MEM_ALIGN);
-	end = roundup(start + new_size, KEXEC_CRASH_MEM_ALIGN);
-
-	crash_free_reserved_phys_range(end, crashk_res.end);
+		/*
+		 * If the new_size is smaller than the reserved lower memory
+		 * range of crashkernel, it releases all higher memory range.
+		 * Otherwise it releases part of higher range.
+		 */
+		end = (new_size <= low_old_size) ?
+			roundup(start, KEXEC_CRASH_MEM_ALIGN) :
+			roundup(start + new_size - low_old_size,
+				KEXEC_CRASH_MEM_ALIGN);
 
-	if ((start == end) && (crashk_res.parent != NULL))
-		release_resource(&crashk_res);
+		ret = __crash_shrink_memory(&crashk_res, start, end);
 
-	ram_res->start = end;
-	ram_res->end = crashk_res.end;
-	ram_res->flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
-	ram_res->name = "System RAM";
+		if (ret)
+			goto unlock;
+	}
 
-	crashk_res.end = end - 1;
+	if (new_size < low_old_size) {
+		low_start = roundup(low_start, KEXEC_CRASH_MEM_ALIGN);
+		low_end = roundup(low_start + new_size, KEXEC_CRASH_MEM_ALIGN);
 
-	insert_resource(&iomem_resource, ram_res);
+		ret = __crash_shrink_memory(&crashk_low_res, low_start, low_end);
+	}
 
 unlock:
 	mutex_unlock(&kexec_mutex);
-- 
1.8.3.1
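The release order described in this patch (high range first, low range only once the high range is gone) can be illustrated with a small user-space sketch. This is not the kernel code: `struct crash_split`, `crash_split_sizes()` and `CRASH_ALIGN` are hypothetical names made up for the example, the alignment value merely stands in for the arch-specific KEXEC_CRASH_MEM_ALIGN, and both ranges are assumed to start on an aligned address so that only sizes need to be tracked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the arch-specific KEXEC_CRASH_MEM_ALIGN. */
#define CRASH_ALIGN 0x1000UL

/* Bytes that remain reserved in each range after a shrink request. */
struct crash_split {
	unsigned long high;	/* kept in the crashkernel=X,high range */
	unsigned long low;	/* kept in the crashkernel=Y,low range */
};

static unsigned long round_up_to(unsigned long x, unsigned long align)
{
	return (x + align - 1) / align * align;
}

/*
 * Split new_size across the two ranges, releasing the high range first.
 * Returns 0 on success, -1 if new_size exceeds the combined reservation
 * (the patch returns -EINVAL in that case).
 */
static int crash_split_sizes(unsigned long high_size, unsigned long low_size,
			     unsigned long new_size, struct crash_split *out)
{
	assert(out != NULL);

	if (new_size > high_size + low_size)
		return -1;

	if (new_size >= low_size) {
		/* Low range untouched; the high range absorbs the shrink. */
		out->low = low_size;
		out->high = round_up_to(new_size - low_size, CRASH_ALIGN);
		if (out->high > high_size)
			out->high = high_size;
	} else {
		/* High range fully released; the low range shrinks too. */
		out->high = 0;
		out->low = round_up_to(new_size, CRASH_ALIGN);
		if (out->low > low_size)
			out->low = low_size;
	}
	return 0;
}
```

With a 256M high range and a 128M low range, asking for 192M keeps all of the low range plus 64M of the high range, while asking for 64M releases the high range entirely and keeps 64M of the low range.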
[PATCH 0/2] kexec: accumulate and release the size of crashkernel
Currently x86 and arm64 support reserving a low memory range for
crashkernel. When crashkernel=Y,low is defined, the main kernel
reserves another memblock (separate from the crashkernel=X,high range,
which is stored in crashk_res) for crashkernel and stores it in
crashk_low_res.

The implementations of crash_get_memory_size() and crash_shrink_memory()
do not consider the extra reserved memory range if it exists. Thus,
first accumulate this range into the size of crashkernel and export the
size via /sys/kernel/kexec_crash_size.

When a value is written to /sys/kernel/kexec_crash_size, both reserved
ranges may be released if the new size is smaller than the current size.
The order of release is (crashk_res -> crashk_low_res). Only if the new
size defined by the user is smaller than the size of the low memory
range do we continue to release the reserved low memory range after
completely releasing the high memory range.

Kaihao Bai (2):
  kexec: accumulate kexec_crash_size if crashk_low_res defined
  kexec: release reserved memory ranges to RAM if crashk_low_res defined

 kernel/kexec_core.c | 77 -
 1 file changed, 58 insertions(+), 19 deletions(-)

-- 
1.8.3.1
[PATCH 1/2] kexec: accumulate kexec_crash_size if crashk_low_res defined
Currently x86 and arm64 support reserving a low memory range for
crashkernel. When crashkernel=Y,low is defined, the main kernel
reserves another memblock (separate from the crashkernel=X,high range,
which is stored in crashk_res) for crashkernel and stores it in
crashk_low_res.

But the value of /sys/kernel/kexec_crash_size only accounts for the
size of crashk_res; the size of crashk_low_res is not included. To keep
/sys/kernel/kexec_crash_size consistent, when crashk_low_res is
defined, its size needs to be added to kexec_crash_size.

Signed-off-by: Kaihao Bai
---
 kernel/kexec_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 4d34c78..137f6eb 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1016,6 +1016,8 @@ size_t crash_get_memory_size(void)
 	mutex_lock(&kexec_mutex);
 	if (crashk_res.end != crashk_res.start)
 		size = resource_size(&crashk_res);
+	if (crashk_low_res.end != crashk_low_res.start)
+		size += resource_size(&crashk_low_res);
 	mutex_unlock(&kexec_mutex);
 	return size;
 }
-- 
1.8.3.1
[PATCH v2] proc/vmcore: fix potential memory leak in vmcore_init()
elfcorehdr_alloc() allocates a memory chunk for elfcorehdr_addr with
kzalloc(). If is_vmcore_usable() returns false, elfcorehdr_addr is a
predefined value. If parse_crash_elf_headers() hits an error and
returns a negative value, elfcorehdr_addr should be released with
elfcorehdr_free().

Fix this by calling elfcorehdr_free() when parse_crash_elf_headers()
fails.

Signed-off-by: Jianglei Nie
---
 fs/proc/vmcore.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index 4eaeb645e759..86887bd90263 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -1569,7 +1569,7 @@ static int __init vmcore_init(void)
 	rc = parse_crash_elf_headers();
 	if (rc) {
 		pr_warn("Kdump: vmcore not initialized\n");
-		return rc;
+		goto fail;
 	}
 	elfcorehdr_free(elfcorehdr_addr);
 	elfcorehdr_addr = ELFCORE_ADDR_ERR;
@@ -1577,6 +1577,9 @@ static int __init vmcore_init(void)
 	proc_vmcore = proc_create("vmcore", S_IRUSR, NULL, &vmcore_proc_ops);
 	if (proc_vmcore)
 		proc_vmcore->size = vmcore_size;
+
+fail:
+	elfcorehdr_free(elfcorehdr_addr);
 	return 0;
 }
 fs_initcall(vmcore_init);
-- 
2.25.1
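The general shape of the fix, releasing a buffer on the error path via a cleanup label, can be sketched in user space as follows. Everything here is a stand-in: `hdr_free()`, `parse_headers()`, `init_sketch()` and the `free_count` counter are hypothetical names, with `malloc()`/`free()` in place of the kernel allocator. One deliberate difference from the hunk as posted: as far as the diff shows, the posted version returns 0 after `fail:` and also reaches that label on the success path, whereas this sketch keeps the error code and frees exactly once on either path. Whether that is the behaviour wanted for vmcore_init() is an assumption of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Counts releases so the single-free behaviour can be checked. */
static int free_count;

/* Stand-in for elfcorehdr_free(). */
static void hdr_free(void *hdr)
{
	if (hdr) {
		free(hdr);
		free_count++;
	}
}

/* Stand-in for parse_crash_elf_headers(); fails on request. */
static int parse_headers(const void *hdr, int should_fail)
{
	(void)hdr;
	return should_fail ? -EINVAL : 0;
}

/* Allocate, parse, and release the header buffer on every exit path. */
static int init_sketch(int should_fail)
{
	void *hdr = malloc(64);
	int rc;

	if (!hdr)
		return -ENOMEM;

	rc = parse_headers(hdr, should_fail);
	if (rc)
		goto out_free;	/* error: skip the publish step */

	/* Success: this is where the parsed result would be published. */

out_free:
	hdr_free(hdr);	/* reached once, on both success and failure */
	return rc;	/* 0 on success, negative errno on failure */
}
```

Calling `init_sketch(1)` returns `-EINVAL` and still releases the buffer, which is the leak the patch description is concerned with.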
Re: [PATCHv7 11/14] x86: Disable kexec if system has unaccepted memory
On Wed, 29 Jun 2022 at 08:59, Kirill A. Shutemov wrote:
>
> On Tue, Jun 28, 2022 at 05:10:56PM -0700, Dave Hansen wrote:
> > On 6/28/22 16:51, Kirill A. Shutemov wrote:
> > > On Fri, Jun 24, 2022 at 05:00:05AM +0300, Kirill A. Shutemov wrote:
> > >>> If there is some deep and fundamental reason why this can not be
> > >>> supported then it probably makes sense to put some code in the
> > >>> arch_kexec_load hook that verifies that deep and fundamental
> > >>> reason is present.
> > ...
> > > +	/*
> > > +	 * TODO: Information on memory acceptance status has to be communicated
> > > +	 * between kernel.
> > > +	 */
> >
> > So, the deep and fundamental reason is... drum roll... you haven't
> > gotten around to implementing bitmap passing yet?!?!? I have the
> > feeling that wasn't what Eric was looking for.
>
> The deep fundamental reason is that everything cannot be implemented and
> upstreamed at once.

If the only thing missing is passing the bitmap to the kexec kernel,
then since you have reserved the bitmap memory I guess it is
straightforward to set the kexec kernel's bootparams.unaccepted_memory
to the old value. Not sure if there are problems when the decompression
code accepts memory again, though.

For the in-kernel kexec_file_load path, refer to
setup_boot_parameters() in arch/x86/kernel/kexec-bzimage64.c; for the
kexec-tools kexec_load path, refer to setup_linux_system_parameters()
in kexec/arch/i386/x86-linux-setup.c.

Thanks
Dave