RE: [RFC PATCH v2 0/3] pmem memmap dump support
Li Zhijian wrote:
> Hello folks,
>
> About 2 months ago, we posted our first RFC[3] and received your kind
> feedback. Thank you :)
> Now, I'm back with the code.
>
> Currently, this RFC has already implemented support for case D*, and
> cases A&B are deliberately disabled in makedumpfile. It includes changes
> in 3 source trees, as below:

I think the reason this patchkit is difficult to follow is that it spends
a lot of time describing a chosen solution, but not enough time describing
the problem and the tradeoffs.

For example, why is updating /proc/vmcore with pmem metadata the chosen
solution? Why not leave the kernel out of it and have the makedumpfile
tooling aware of how to parse persistent memory namespace info-blocks and
retrieve that dump itself? This is what I proposed here:

http://lore.kernel.org/r/641484f7ef780_a52e2...@dwillia2-mobl3.amr.corp.intel.com.notmuch

...but never got an answer, or I missed the answer.
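The alternative being proposed, teaching makedumpfile to recognize a pmem namespace's on-media info-block itself, would hinge on a signature check against the raw device. A minimal userspace sketch follows; the signature string matches what the nvdimm driver writes, but the abbreviated field layout here is illustrative only (the authoritative definition is struct nd_pfn_sb in drivers/nvdimm/pfn.h), and the info-block's location within the namespace is an assumption:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PFN_SIG_LEN 16
#define PFN_SIG "NVDIMM_PFN_INFO"	/* 15 chars + NUL terminator */

/* Abbreviated, illustrative view of the on-media info block; the real
 * struct nd_pfn_sb carries more fields (version, dataoff, npfns, mode,
 * checksum, ...). */
struct pfn_info_block {
	uint8_t  signature[PFN_SIG_LEN];
	uint8_t  uuid[16];
	uint8_t  parent_uuid[16];
	uint32_t flags;
	/* ... truncated ... */
};

/* Return 1 if a buffer read from the (assumed) info-block offset of the
 * namespace looks like a PFN info block, i.e. the namespace keeps its
 * memmap on the pmem media itself and a dump tool could locate it
 * without kernel help. */
static int is_pfn_info_block(const struct pfn_info_block *sb)
{
	return memcmp(sb->signature, PFN_SIG, sizeof(PFN_SIG)) == 0;
}
```

A tool taking this route would read the candidate block from the namespace device, run the check above, and only then trust the metadata fields that follow the signature.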
Re: [PATCH v21 5/7] x86/crash: add x86 crash hotplug support
On 28/04/23 2:55 pm, Baoquan He wrote:
> On 04/27/23 at 10:26pm, Hari Bathini wrote:
>> On 27/04/23 2:19 pm, Baoquan He wrote:
>>> On 04/27/23 at 12:39pm, Hari Bathini wrote:
>>>> Hi Eric,
>>>>
>>>> On 04/04/23 11:33 pm, Eric DeVolder wrote:
>>>>> When CPU or memory is hot un/plugged, or off/onlined, the crash
>>>>> elfcorehdr, which describes the CPUs and memory in the system,
>>>>> must also be updated.
>>>>>
>>>>> The segment containing the elfcorehdr is identified at run-time
>>>>> in crash_core:crash_handle_hotplug_event(), which works for both
>>>>> the kexec_load() and kexec_file_load() syscalls. A new elfcorehdr
>>>>> is generated from the available CPUs and memory into a buffer,
>>>>> and then installed over the top of the existing elfcorehdr.
>>>>>
>>>>> In the patch 'kexec: exclude elfcorehdr from the segment digest'
>>>>> the need to update purgatory due to the change in elfcorehdr was
>>>>> eliminated. As a result, no changes to purgatory or boot_params
>>>>> (as the elfcorehdr= kernel command line parameter pointer
>>>>> remains unchanged and correct) are needed, just elfcorehdr.
>>>>>
>>>>> To accommodate a growing number of resources via hotplug, the
>>>>> elfcorehdr segment must be sufficiently large to accommodate
>>>>> changes, see the CRASH_MAX_MEMORY_RANGES description. This is used
>>>>> only on the kexec_file_load() syscall; for kexec_load() userspace
>>>>> will need to size the segment similarly.
>>>>>
>>>>> To accommodate kexec_load() syscall in the absence of
>>>>
>>>> Firstly, thanks! This series is a nice improvement to kdump support
>>>> in a hotplug environment.
>>>>
>>>> One concern though is that this change assumes corresponding support
>>>> in kexec-tools. Without that support, kexec_load would fail to boot
>>>> with a digest verification failure, iiuc.
>>>
>>> Eric has posted a patchset to modify kexec-tools to support that;
>>> please see the link Eric pasted in the cover letter:
>>>
>>> http://lists.infradead.org/pipermail/kexec/2022-October/026032.html
>>
>> Right, Baoquan. I did see that, and if I read the code correctly,
>> without that patchset kexec_load would fail. Not with an explicit error
>> that hotplug support is missing or such, but it would simply fail to
>> boot into the capture kernel with a digest verification failure.
>>
>> My suggestion was to avoid that userspace tool breakage for older
>> kexec-tools versions by introducing a new kexec flag that can tell the
>> kernel that kexec-tools is ready to use this in-kernel update support.
>> So, if kexec_load happens without the flag, avoid doing an in-kernel
>> update on hotplug. I hope that clears the confusion.
>
> Yeah, sounds like a good idea. It may be extended in later patch.

Fixing it in this series itself would be a cleaner way, I guess.

Thanks
Hari

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
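The opt-in Hari suggests could be gated roughly as below. The flag name and value are hypothetical, chosen for illustration, and loaded_via_file_load stands in for however the kernel records which syscall loaded the crash image:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value: userspace (kexec-tools) would set it on
 * kexec_load() to declare that it excluded the elfcorehdr from the
 * segment digest, so the kernel may rewrite that segment on hotplug. */
#define KEXEC_UPDATE_ELFCOREHDR 0x00000004UL

/* Kernel-side gate: only rewrite the elfcorehdr when the loader opted
 * in.  Otherwise an updated segment would no longer match the digest an
 * older kexec-tools computed, and the capture kernel would fail digest
 * verification at boot. */
static bool elfcorehdr_update_allowed(unsigned long kexec_flags,
				      bool loaded_via_file_load)
{
	/* kexec_file_load() builds and hashes the segments in-kernel,
	 * so updating is always safe on that path. */
	if (loaded_via_file_load)
		return true;

	return kexec_flags & KEXEC_UPDATE_ELFCOREHDR;
}
```

With a gate like this, an old kexec-tools that never sets the flag simply keeps today's behavior: the elfcorehdr is left alone on hotplug and the digest still verifies.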
Re: [PATCH v21 5/7] x86/crash: add x86 crash hotplug support
On 04/27/23 at 10:26pm, Hari Bathini wrote:
> On 27/04/23 2:19 pm, Baoquan He wrote:
> > On 04/27/23 at 12:39pm, Hari Bathini wrote:
> > > Hi Eric,
> > >
> > > On 04/04/23 11:33 pm, Eric DeVolder wrote:
> > > > When CPU or memory is hot un/plugged, or off/onlined, the crash
> > > > elfcorehdr, which describes the CPUs and memory in the system,
> > > > must also be updated.
> > > >
> > > > The segment containing the elfcorehdr is identified at run-time
> > > > in crash_core:crash_handle_hotplug_event(), which works for both
> > > > the kexec_load() and kexec_file_load() syscalls. A new elfcorehdr
> > > > is generated from the available CPUs and memory into a buffer,
> > > > and then installed over the top of the existing elfcorehdr.
> > > >
> > > > In the patch 'kexec: exclude elfcorehdr from the segment digest'
> > > > the need to update purgatory due to the change in elfcorehdr was
> > > > eliminated. As a result, no changes to purgatory or boot_params
> > > > (as the elfcorehdr= kernel command line parameter pointer
> > > > remains unchanged and correct) are needed, just elfcorehdr.
> > > >
> > > > To accommodate a growing number of resources via hotplug, the
> > > > elfcorehdr segment must be sufficiently large to accommodate
> > > > changes, see the CRASH_MAX_MEMORY_RANGES description. This is used
> > > > only on the kexec_file_load() syscall; for kexec_load() userspace
> > > > will need to size the segment similarly.
> > > >
> > > > To accommodate kexec_load() syscall in the absence of
> > >
> > > Firstly, thanks! This series is a nice improvement to kdump support
> > > in a hotplug environment.
> > >
> > > One concern though is that this change assumes corresponding support
> > > in kexec-tools. Without that support, kexec_load would fail to boot
> > > with a digest verification failure, iiuc.
> >
> > Eric has posted a patchset to modify kexec-tools to support that;
> > please see the link Eric pasted in the cover letter:
> >
> > http://lists.infradead.org/pipermail/kexec/2022-October/026032.html
>
> Right, Baoquan. I did see that, and if I read the code correctly,
> without that patchset kexec_load would fail. Not with an explicit error
> that hotplug support is missing or such, but it would simply fail to
> boot into the capture kernel with a digest verification failure.
>
> My suggestion was to avoid that userspace tool breakage for older
> kexec-tools versions by introducing a new kexec flag that can tell the
> kernel that kexec-tools is ready to use this in-kernel update support.
> So, if kexec_load happens without the flag, avoid doing an in-kernel
> update on hotplug. I hope that clears the confusion.

Yeah, sounds like a good idea. It may be extended in later patch.
Re: [RFC PATCH v2 3/3] resource, crash: Make kexec_file_load support pmem
Greg,

Sorry for the *BAD* changelog; this patch is currently most like a *HACK*
on resource.c. Please allow me to rewrite it:

Only the regions described by the PT_LOADs of /proc/vmcore are
dumpable/readable by dumping applications. Previously, on x86/x86_64, only
System RAM resources were injected into the PT_LOADs. So in order to make
the entire pmem resource dumpable/readable, we need to add the pmem region
to the PT_LOADs of /proc/vmcore.

Here we introduce a new API, walk_pmem_res(), to walk the pmem regions
first. Further, we also mark a pmem region with the extra p_flags bit
PF_DEV when it is added to the PT_LOADs. Dumping applications are then
able to tell whether a region is pmem or not according to this flag and
take special actions correspondingly.

Thanks
Zhijian

On 27/04/2023 19:39, Greg Kroah-Hartman wrote:
> On Thu, Apr 27, 2023 at 06:18:34PM +0800, Li Zhijian wrote:
>> It does:
>> 1. Add pmem region into PT_LOADs of vmcore
>> 2. Mark pmem region's p_flags as PF_DEV
>
> I'm sorry, but I can not parse this changelog.
>
> Please take a look at the kernel documentation for how to write a good
> changelog message so that we can properly review the change you wish to
> have accepted.
>
> thanks,
>
> greg k-h
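On the consumer side, a dump tool reading such a vmcore could branch on the proposed flag when it iterates the program headers. A sketch, with the PF_DEV value chosen for illustration (the series defines the real one):

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <string.h>

/* Proposed in this series: an extra p_flags bit marking a PT_LOAD that
 * covers a pmem region.  The value here is illustrative only. */
#define PF_DEV (1 << 4)

/* A makedumpfile-style reader can use the flag to treat pmem loads
 * specially, e.g. skip them unless the user explicitly asked for pmem
 * contents in the dump. */
static bool phdr_is_pmem(const Elf64_Phdr *phdr)
{
	return phdr->p_type == PT_LOAD && (phdr->p_flags & PF_DEV);
}
```

Without such a marker, pmem PT_LOADs would be indistinguishable from System RAM ones, and the tool would have no way to apply a different dump policy to them.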
Re: [RFC PATCH v2 3/3] resource, crash: Make kexec_file_load support pmem
Jane,

On 28/04/2023 04:41, Jane Chu wrote:
>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>> index cdd92ab43cda..dc9d03083565 100644
>> --- a/arch/x86/kernel/crash.c
>> +++ b/arch/x86/kernel/crash.c
>> @@ -178,6 +178,7 @@ static struct crash_mem *fill_up_crash_elf_data(void)
>>  	if (!nr_ranges)
>>  		return NULL;
>> +	walk_pmem_res(0, -1, &nr_ranges, get_nr_ram_ranges_callback);
>
> So this will overwrite 'nr_ranges' produced by the previous
> walk_system_ram_res() call, sure it's correct?

It depends on how the callback passed to walk_system_ram_res() handles
'nr_ranges'; it is safe for this change IMHO:

static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
{
	unsigned int *nr_ranges = arg;

	(*nr_ranges)++;
	return 0;
}

/* Gather all the required information to prepare elf headers for ram regions */
static struct crash_mem *fill_up_crash_elf_data(void)
{
	unsigned int nr_ranges = 0;
	struct crash_mem *cmem;

	walk_system_ram_res(0, -1, &nr_ranges, get_nr_ram_ranges_callback);
	if (!nr_ranges)
		return NULL;

	walk_pmem_res(0, -1, &nr_ranges, get_nr_ram_ranges_callback);

In the end, nr_ranges = #ram_res + #pmem_res.

Thanks
Zhijian

> Regards,
> -jane
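The accumulation Zhijian describes can be modeled in a few lines. Here walk_fake_resources() is a toy stand-in for walk_system_ram_res()/walk_pmem_res(), invoking the callback once per matching resource:

```c
#include <assert.h>

/* Mirror of get_nr_ram_ranges_callback(): every walked resource just
 * bumps the counter behind 'arg', so a second walk accumulates into the
 * same counter instead of overwriting it. */
static int count_range_cb(void *arg)
{
	unsigned int *nr_ranges = arg;

	(*nr_ranges)++;
	return 0;
}

/* Toy stand-in for walk_system_ram_res()/walk_pmem_res(): invoke the
 * callback once per (pretend) matching resource. */
static void walk_fake_resources(int nr_res, void *arg, int (*cb)(void *))
{
	int i;

	for (i = 0; i < nr_res; i++)
		cb(arg);
}
```

Because the walkers only ever call the callback, and the callback only increments, calling a second walker after the first cannot clobber the count; it can only add to it.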
Re: [RFC PATCH v2 2/3] drivers/nvdimm: export memmap of namespace to vmcoreinfo
On 28/04/2023 06:50, Ira Weiny wrote:
> Li Zhijian wrote:
>> Each namespace has its own memmap; it will be updated when the
>> namespace is initialized/created, updated, and deleted.
>>
>> CC: Dan Williams
>> CC: Vishal Verma
>> CC: Dave Jiang
>> CC: Ira Weiny
>> Signed-off-by: Li Zhijian
>> ---
>>  drivers/nvdimm/namespace_devs.c | 2 ++
>>  drivers/nvdimm/pfn_devs.c       | 3 +++
>>  2 files changed, 5 insertions(+)
>>
>> diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
>> index c60ec0b373c5..096203e6203f 100644
>> --- a/drivers/nvdimm/namespace_devs.c
>> +++ b/drivers/nvdimm/namespace_devs.c
>> @@ -8,6 +8,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include "nd-core.h"
>>  #include "pmem.h"
>>  #include "pfn.h"
>> @@ -853,6 +854,7 @@ static ssize_t size_store(struct device *dev,
>>  	if (rc == 0 && val == 0 && is_namespace_pmem(dev)) {
>>  		struct nd_namespace_pmem *nspm = to_nd_namespace_pmem(dev);
>>
>> +		devm_memmap_vmcore_delete(to_ndns(dev));
>
> This seems like an odd place to put this. Could you explain the
> reasoning more?

Ira,

Users who want to manage a pmem namespace usually use the 'ndctl'
command. The following cases touch the memmap of the namespace:

a. create namespace:
   'ndctl create-namespace -f -e namespace0.0 --mode=fsdax -s $(((1024+16)<<20)) -M dev'
b. change namespace size:
   'ndctl create-namespace -f -e namespace0.0 --mode=fsdax -s $(((1024)<<20)) -M dev'
c. change memmap location:
   'ndctl create-namespace -f -e namespace0.0 --mode=fsdax -s $(((1024+16)<<20)) -M mem'
d. destroy namespace:
   'ndctl destroy-namespace -f namespace0.0'

Unlike the former 3 cases, case d will not invoke __nvdimm_setup_pfn().
Instead, ndctl just does something like
'echo 0 > /sys/bus/nd/devices/namespace0.0/size'. We have to delete this
namespace from devm_memmap_vmcore in this case. So this is an odd place,
but it works. I tried to find a place pairing with __nvdimm_setup_pfn(),
but failed in the end.

If you have any good idea, please let me know :)

Thanks
Zhijian

> Ira
>
>> 		kfree(nspm->uuid);
>> 		nspm->uuid = NULL;
>>  	}
>> diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
>> index af7d9301520c..80076996b2da 100644
>> --- a/drivers/nvdimm/pfn_devs.c
>> +++ b/drivers/nvdimm/pfn_devs.c
>> @@ -9,6 +9,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include "nd-core.h"
>>  #include "pfn.h"
>>  #include "nd.h"
>> @@ -716,6 +717,8 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn,
>> struct dev_pagemap *pgmap)
>>  	} else
>>  		return -ENXIO;
>>
>> +	devm_memmap_vmcore_update(ndns, altmap->base_pfn, PHYS_PFN(offset),
>> +			nd_pfn->mode == PFN_MODE_PMEM);
>>  	return 0;
>>  }
>>
>> --
>> 2.29.2
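For reference, the update/delete pairing being discussed amounts to keyed bookkeeping: update inserts or overwrites the entry for a namespace, delete removes it on destroy. Below is a userspace model of that invariant; the names and fixed-size table are illustrative, not the proposed kernel API:

```c
#include <assert.h>

#define MAX_NS 8

/* One slot per namespace, keyed by an opaque namespace identity. */
struct memmap_entry {
	const void *ndns;			/* namespace identity */
	unsigned long start_pfn, end_pfn;	/* memmap pfn range */
	int in_use;
};

static struct memmap_entry table[MAX_NS];

/* Insert, or overwrite the existing entry for the same namespace
 * (covers create, resize, and memmap-relocation, cases a-c). */
static void memmap_vmcore_update(const void *ndns,
				 unsigned long start_pfn,
				 unsigned long end_pfn)
{
	int slot = -1;
	int i;

	for (i = 0; i < MAX_NS; i++) {
		if (table[i].in_use && table[i].ndns == ndns) {
			slot = i;		/* existing entry wins */
			break;
		}
		if (!table[i].in_use && slot < 0)
			slot = i;
	}
	if (slot >= 0)
		table[slot] = (struct memmap_entry){ ndns, start_pfn,
						     end_pfn, 1 };
}

/* Remove the entry when the namespace is destroyed (case d), which is
 * exactly the path that never reaches __nvdimm_setup_pfn(). */
static void memmap_vmcore_delete(const void *ndns)
{
	int i;

	for (i = 0; i < MAX_NS; i++)
		if (table[i].in_use && table[i].ndns == ndns)
			table[i].in_use = 0;
}

static int memmap_vmcore_count(void)
{
	int i, n = 0;

	for (i = 0; i < MAX_NS; i++)
		n += table[i].in_use;
	return n;
}
```

The model makes the asymmetry visible: cases a-c all funnel through the update path, while destroy needs an explicit delete somewhere on the size-to-zero path, which is why size_store() ends up hosting it.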