Re: [PATCH RESEND v2 0/5] Avoid requesting page from DMA zone when no managed pages

2021-12-07 Thread John Donnelly

On 12/7/21 22:33, Andrew Morton wrote:

On Mon, 6 Dec 2021 22:03:59 -0600 John Donnelly  
wrote:


On 12/6/21 9:16 PM, Baoquan He wrote:

Sorry, forgot adding x86 and x86/mm maintainers


Hi,

These commits need applied to Linux-5.15.0 (LTS) too since it has the
original regression :

   1d659236fb43 ("dma-pool: scale the default DMA coherent pool
size with memory capacity")

Maybe add "Fixes" to the other commits ?


And cc:stable, please.  "Fixes:" doesn't always mean "should be
backported



Hi.


Does this mean we need a v3 version ?



___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH RESEND v2 0/5] Avoid requesting page from DMA zone when no managed pages

2021-12-07 Thread Andrew Morton
On Mon, 6 Dec 2021 22:03:59 -0600 John Donnelly  
wrote:

> On 12/6/21 9:16 PM, Baoquan He wrote:
> > Sorry, forgot adding x86 and x86/mm maintainers
> 
> Hi,
> 
>These commits need applied to Linux-5.15.0 (LTS) too since it has the 
> original regression :
> 
>   1d659236fb43 ("dma-pool: scale the default DMA coherent pool
> size with memory capacity")
> 
> Maybe add "Fixes" to the other commits ?

And cc:stable, please.  "Fixes:" doesn't always mean "should be
backported".

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC v1 0/8] RFC v1: Kernel handling of CPU and memory hot un/plug for crash

2021-12-07 Thread Eric DeVolder

See below, thanks, eric

On 12/1/21 06:59, Baoquan He wrote:

+ akpm

On 11/29/21 at 01:42pm, Eric DeVolder wrote:

Hi, see below.
eric

On 11/24/21 03:02, Baoquan He wrote:

Hi,

On 11/18/21 at 12:49pm, Eric DeVolder wrote:
..

This patchset introduces a generic crash hot un/plug handler that
registers with the CPU and memory notifiers. Upon CPU or memory
changes, this generic handler is invoked and performs important
housekeeping, for example obtaining the appropriate lock, and then
invokes an architecture specific handler to do the appropriate
updates.

In the case of x86_64, the arch specific handler generates a new
elfcorehdr, which reflects the current CPUs and memory regions, into a
buffer. Since purgatory also does an integrity check via hash digests
of the loaded segments, purgatory must also be updated with the new


When I tried to address this with a draft patch, I started with a
different way in which udev rule triggers reloading and only elfcorehdr
segment is updated. The update should be less time consuming. Seems
internal notifier is better in your way. But I didn't update purgatory
since I just skipped the elfcorehdr part when calculate the digest of
segments. The reason from my mind is kernel text, initrd must contribute
most part of the digest, elfcorehdr is much less, and it will simplify
code change more. Doing so let us have no need to touch purgatory at
all. What do you think?


Well certainly if purgatory did not need to be updated, then that simplifies
matters quite a bit!

I do not have any context on the history of including elfcorehdr in the 
purgatory
integrity check. I do agree with you that checking kernel, initrd, boot_params
is most important. Perhaps allowing the elfcorehdr data structure to change
isn't too bad without including in the integrity check is ok as there is some
sanity checking of it by the capture kernel as it reads it for /proc/vmcore 
setup.


Well, I think the check has included elfcorehdr since user space
kexec-tools added the check. We can do the skipping in kexec_file load
in kernel for the time being, see if anyone has concern about the
safety or security. Since agreement has been reached, can you split out
the purgatory update and repost a new patchset with the current
frame work to only update elfcorehdr?


I reworked the patchset as you suggested and removed the reload of purgatory.
It simplified things considerably.



Any by the way, I think you have written a very great cover letter which
tells almost all details about the change. However, pity that they are
not put in patch log. In your patch log, you just tell what change is
made in the patch, but the why we need it which is the most important part
is not seen. Most of time, we can get what change has been made from the
code, surely it's very helpful if patch log has told it and can save
reviewers much time, but it's not easy to get why it's needed or
introduced if not being involved in earlier discussion or any context.
And as you know, cover letter will be stripped away whem maintainers
merge patch, only patch log is kept.


I've tried to place more information in the individual patch commit messages,
or the code itself.

I just posted v2!

Thanks for your interest and review!
eric



Thanks
Baoquan





Still reviewing.


Thank you!




digests. The arch handler also generates a new purgatory into a
buffer, performs the hash digests of the new memory segments, and then
patches purgatory with the new digests.  If all succeeds, then the
elfcorehdr and purgatory buffers over write the existing buffers and
the new kdump image is live and ready to go. No involvement with
userspace at all.








___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[RFC v2 4/6] crash hp: generic crash hotplug support infrastructure

2021-12-07 Thread Eric DeVolder
This patch introduces a generic crash hot plug/unplug infrastructure
for CPU and memory changes. Upon CPU and memory changes, a generic
crash_hotplug_handler() obtains the appropriate lock, does some
important house keeping and then dispatches the hot plug/unplug event
to the architecture specific arch_crash_hotplug_handler(), and when
that handler returns, the lock is released.

This patch modifies crash_core.c to implement a subsys_initcall()
function that installs handlers for hot plug/unplug events. If CPU
hotplug is enabled, then cpuhp_setup_state() is invoked to register a
handler for CPU changes. Similarly, if memory hotplug is enabled, then
register_memory_notifier() is invoked to install a handler for memory
changes. These handlers in turn invoke the common generic handler
crash_hotplug_handler().

On the CPU side, cpuhp_setup_state_nocalls() is invoked with parameter
CPUHP_AP_ONLINE_DYN. While this works, when a CPU is being unplugged,
the CPU still shows up in foreach_present_cpu() during the regeneration
of the new CPU list, thus the need to explicitly check and exclude the
soon-to-be offlined CPU in crash_prepare_elf64_headers().

On the memory side, each un/plugged memory block passes through the
handler. For example, if a 1GiB DIMM is hotplugged, that generate 8
memory events, one for each 128MiB memblock.

Signed-off-by: Eric DeVolder 
---
 kernel/crash_core.c | 118 
 1 file changed, 118 insertions(+)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index eb53f5ec62c9..9a30a305b04d 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -8,12 +8,16 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
 
 #include 
 
+#include "kexec_internal.h"
+
 /* vmcoreinfo stuff */
 unsigned char *vmcoreinfo_data;
 size_t vmcoreinfo_size;
@@ -480,3 +484,117 @@ static int __init crash_save_vmcoreinfo_init(void)
 }
 
 subsys_initcall(crash_save_vmcoreinfo_init);
+
+#ifdef CONFIG_CRASH_HOTPLUG
+void __weak arch_crash_hotplug_handler(struct kimage *image,
+   unsigned int hp_action, unsigned long a, unsigned long b)
+{
+   pr_warn("crash hp: %s not implemented", __func__);
+}
+
+static void crash_hotplug_handler(unsigned int hp_action,
+   unsigned long a, unsigned long b)
+{
+   /* Obtain lock while changing crash information */
+   if (!mutex_trylock(_mutex))
+   return;
+
+   /* Check kdump is loaded */
+   if (kexec_crash_image) {
+   pr_debug("crash hp: hp_action %u, a %lu, b %lu", hp_action,
+   a, b);
+
+   /* Needed in order for the segments to be updated */
+   arch_kexec_unprotect_crashkres();
+
+   /* Flag to differentiate between normal load and hotplug */
+   kexec_crash_image->hotplug_event = true;
+
+   /*
+* Due to use of CPUHP_AP_ONLINE_DYN, upon unplug and during
+* this callback, the CPU is still in the for_each_present_cpu()
+* list. Must explicitly look to exclude this CPU when building
+* new list.
+*/
+   kexec_crash_image->offlinecpu =
+   (hp_action == KEXEC_CRASH_HP_REMOVE_CPU) ?
+   (unsigned int)a : ~0U;
+
+   /* Now invoke arch-specific update handler */
+   arch_crash_hotplug_handler(kexec_crash_image, hp_action, a, b);
+
+   /* No longer handling a hotplug event */
+   kexec_crash_image->hotplug_event = false;
+
+   /* Change back to read-only */
+   arch_kexec_protect_crashkres();
+   }
+
+   /* Release lock now that update complete */
+   mutex_unlock(_mutex);
+}
+
+#if defined(CONFIG_MEMORY_HOTPLUG)
+static int crash_memhp_notifier(struct notifier_block *nb,
+   unsigned long val, void *v)
+{
+   struct memory_notify *mhp = v;
+   unsigned long start, end;
+
+   start = mhp->start_pfn << PAGE_SHIFT;
+   end = ((mhp->start_pfn + mhp->nr_pages) << PAGE_SHIFT) - 1;
+
+   switch (val) {
+   case MEM_GOING_ONLINE:
+   crash_hotplug_handler(KEXEC_CRASH_HP_ADD_MEMORY,
+   start, end-start);
+   break;
+
+   case MEM_OFFLINE:
+   case MEM_CANCEL_ONLINE:
+   crash_hotplug_handler(KEXEC_CRASH_HP_REMOVE_MEMORY,
+   start, end-start);
+   break;
+   }
+   return NOTIFY_OK;
+}
+
+static struct notifier_block crash_memhp_nb = {
+   .notifier_call = crash_memhp_notifier,
+   .priority = 0
+};
+#endif
+
+#if defined(CONFIG_HOTPLUG_CPU)
+static int crash_cpuhp_online(unsigned int cpu)
+{
+   crash_hotplug_handler(KEXEC_CRASH_HP_ADD_CPU, cpu, 0);
+   return 0;
+}
+
+static int crash_cpuhp_offline(unsigned int cpu)
+{
+   crash_hotplug_handler(KEXEC_CRASH_HP_REMOVE_CPU, cpu, 0);
+   return 0;
+}
+#endif
+

[RFC v2 0/6] crash: Kernel handling of CPU and memory hot un/plug

2021-12-07 Thread Eric DeVolder
When the kdump service is loaded, if a CPU or memory is hot
un/plugged, the crash elfcorehdr (for x86), which describes the CPUs
and memory in the system, must also be updated, else the resulting
vmcore is inaccurate (eg. missing either CPU context or memory
regions).

The current solution utilizes udev to initiate an unload-then-reload
of the kdump image (e. kernel, initrd, boot_params, puratory and
elfcorehdr) by the userspace kexec utility. In previous posts I have
outlined the significant performance problems related to offloading
this activity to userspace.

This patchset introduces a generic crash hot un/plug handler that
registers with the CPU and memory notifiers. Upon CPU or memory
changes, this generic handler is invoked and performs important
housekeeping, for example obtaining the appropriate lock, and then
invokes an architecture specific handler to do the appropriate
updates.

In the case of x86_64, the arch specific handler generates a new
elfcorehdr, and overwrites the old one in memory. No involvement
with userspace needed.

To realize the benefits/test this patchset, one must make a couple
of minor changes to userspace:

 - Disable the udev rule for updating kdump on hot un/plug changes
   Eg. on RHEL: rm -f /usr/lib/udev/rules.d/98-kexec.rules
   or other technique to neuter the rule.

 - Change to the kexec_file_load for loading the kdump kernel:
   Eg. on RHEL: in /usr/bin/kdumpctl, change to:
standard_kexec_args="-p -d -s"
   which adds the -s to select kexec_file_load syscall.

This patchset supports kexec_load with a modified kexec userspace
utility, on which I am current working to provide separately.

Regards,
eric
---
RFC v2: 7dec2021
 - Acting upon Baoquan He suggestion of removing elfcorehdr from
   the purgatory list of segments, removed purgatory code from
   patchset, and it is signficiantly simpler now.

RFC v1: 18nov2021
 https://lkml.org/lkml/2021/11/18/845
 - working patchset demonstrating kernel handling of hotplug
   updates to x86 elfcorehdr for kexec_file_load

RFC: 14dec2020
 https://lkml.org/lkml/2020/12/14/532
 - proposed concept of allowing kernel to handle hotplug update
   of elfcorehdr
---


Eric DeVolder (6):
  crash: fix minor typo/bug in debug message
  crash hp: Introduce CRASH_HOTPLUG configuration options
  crash hp: definitions and prototype changes
  crash hp: generic crash hotplug support infrastructure
  crash hp: kexec_file changes for crash hotplug support
  crash hp: Add x86 crash hotplug support

 arch/x86/Kconfig|  26 
 arch/x86/kernel/crash.c | 140 +++-
 include/linux/kexec.h   |  21 +-
 kernel/crash_core.c | 118 +
 kernel/kexec_file.c |  15 -
 5 files changed, 314 insertions(+), 6 deletions(-)

-- 
2.27.0


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[RFC v2 1/6] crash: fix minor typo/bug in debug message

2021-12-07 Thread Eric DeVolder
The pr_debug() intends to display the memsz member, but the
parameter is actually the bufsz member (which is already
displayed). Correct this to display memsz value.

Signed-off-by: Eric DeVolder 
Acked-by: Baoquan He 
---
 arch/x86/kernel/crash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index e8326a8d1c5d..9730c88530fc 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -407,7 +407,7 @@ int crash_load_segments(struct kimage *image)
}
image->elf_load_addr = kbuf.mem;
pr_debug("Loaded ELF headers at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
-image->elf_load_addr, kbuf.bufsz, kbuf.bufsz);
+image->elf_load_addr, kbuf.bufsz, kbuf.memsz);
 
return ret;
 }
-- 
2.27.0


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[RFC v2 5/6] crash hp: kexec_file changes for crash hotplug support

2021-12-07 Thread Eric DeVolder
Two important changes to note:

The kexec_calculate_store_digests() changed to specifically EXCLUDE
the elfcorehdr segment from its list of segments to check.
This is an important change as it allows, in a hotplug environment,
for the elfcorehdr segment (which contains the list of CPUs and
memory regions) to change dynamically without the need to update
purgatory (with the hash/digests of the segments it checks) as well.

The crash_prepare_elf64_headers() changed to look for the offline'd
CPU and exclude it. This due to the fact that the offline'd CPU is
still in the for_each_present_cpu() list at this point in time on
the cpu hotplug handler path.

Signed-off-by: Eric DeVolder 
---
 kernel/kexec_file.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 8347fc158d2b..339995d42169 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -765,6 +765,12 @@ static int kexec_calculate_store_digests(struct kimage 
*image)
for (j = i = 0; i < image->nr_segments; i++) {
struct kexec_segment *ksegment;
 
+#ifdef CONFIG_CRASH_HOTPLUG
+   /* This segment excluded to allow future changes via hotplug */
+   if (image->elf_index_valid && (j == image->elf_index))
+   continue;
+#endif
+
ksegment = >segment[i];
/*
 * Skip purgatory as it will be modified once we put digest
@@ -1260,8 +1266,8 @@ int crash_exclude_mem_range(struct crash_mem *mem,
return 0;
 }
 
-int crash_prepare_elf64_headers(struct crash_mem *mem, int kernel_map,
- void **addr, unsigned long *sz)
+int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
+   int kernel_map, void **addr, unsigned long *sz)
 {
Elf64_Ehdr *ehdr;
Elf64_Phdr *phdr;
@@ -1308,6 +1314,11 @@ int crash_prepare_elf64_headers(struct crash_mem *mem, 
int kernel_map,
 
/* Prepare one phdr of type PT_NOTE for each present CPU */
for_each_present_cpu(cpu) {
+#ifdef CONFIG_CRASH_HOTPLUG
+   /* Skip the soon-to-be offlined cpu */
+   if (image->hotplug_event && (cpu == image->offlinecpu))
+   continue;
+#endif
phdr->p_type = PT_NOTE;
notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
phdr->p_offset = phdr->p_paddr = notes_addr;
-- 
2.27.0


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[RFC v2 6/6] crash hp: Add x86 crash hotplug support

2021-12-07 Thread Eric DeVolder
For x86_64, when CPU or memory is hot un/plugged, the crash
elfcorehdr, which describes the CPUs and memory in the system,
must also be updated.

To update the elfcorehdr for x86_64, a new elfcorehdr must be
generated from the available CPUs and memory. The new elfcorehdr
is prepared into a buffer, and if no errors occur, it is
installed over the top of the existing elfcorehdr.

In the patch 'crash hp: kexec_file changes for crash hotplug support'
the need to update purgatory due to the change in elfcorehdr was
eliminated.  As a result, no changes to purgatory or boot_params
(as the elfcorehdr= kernel command line parameter pointer
remains unchanged and correct) are needed, just elfcorehdr.

To accommodate a growing number of resources via hotplug, the
elfcorehdr segment must be sufficiently large enough to accommodate
changes, see the CRASH_HOTPLUG_ELFCOREHDR_SZ configure item.

NOTE that this supports both kexec_load and kexec_file_load. Support
for kexec_load is made possible by identifying the elfcorehdr segment
at load time and updating it as previously described. However, it is
the responsibility of the userspace kexec utility to ensure that:
 - the elfcorehdr segment is sufficiently large enough to accommodate
   hotplug changes, ala CRASH_HOTPLUG_ELFCOREHDR_SZ.
 - provides a purgatory that excludes the elfcorehdr from its list of
   run-time segments to check.
These changes to the userspace kexec utility are not yet available.

Signed-off-by: Eric DeVolder 
---
 arch/x86/kernel/crash.c | 138 +++-
 1 file changed, 137 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 9730c88530fc..d185137b33d4 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -265,7 +266,8 @@ static int prepare_elf_headers(struct kimage *image, void 
**addr,
goto out;
 
/* By default prepare 64bit headers */
-   ret =  crash_prepare_elf64_headers(cmem, IS_ENABLED(CONFIG_X86_64), 
addr, sz);
+   ret =  crash_prepare_elf64_headers(image, cmem,
+   IS_ENABLED(CONFIG_X86_64), addr, sz);
 
 out:
vfree(cmem);
@@ -397,7 +399,17 @@ int crash_load_segments(struct kimage *image)
image->elf_headers = kbuf.buffer;
image->elf_headers_sz = kbuf.bufsz;
 
+#ifdef CONFIG_CRASH_HOTPLUG
+   /* Ensure elfcorehdr segment large enough for hotplug changes */
+   kbuf.memsz = CONFIG_CRASH_HOTPLUG_ELFCOREHDR_SZ;
+   /* For marking as usable to crash kernel */
+   image->elf_headers_sz = kbuf.memsz;
+   /* Record the index of the elfcorehdr segment */
+   image->elf_index = image->nr_segments;
+   image->elf_index_valid = true;
+#else
kbuf.memsz = kbuf.bufsz;
+#endif
kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
ret = kexec_add_buffer();
@@ -412,3 +424,127 @@ int crash_load_segments(struct kimage *image)
return ret;
 }
 #endif /* CONFIG_KEXEC_FILE */
+
+#ifdef CONFIG_CRASH_HOTPLUG
+void *map_crash_pages(unsigned long paddr, unsigned long size)
+{
+   /*
+* NOTE: The addresses and sizes passed to this routine have
+* already been fully aligned on page boundaries. There is no
+* need for massaging the address or size.
+*/
+   void *ptr = NULL;
+
+   /* NOTE: requires arch_kexec_[un]protect_crashkres() for write access */
+   if (size > 0) {
+   struct page *page = pfn_to_page(paddr >> PAGE_SHIFT);
+
+   ptr = kmap(page);
+   }
+
+   return ptr;
+}
+
+void unmap_crash_pages(void **ptr)
+{
+   if (ptr) {
+   if (*ptr)
+   kunmap(*ptr);
+   *ptr = NULL;
+   }
+}
+
+void arch_crash_hotplug_handler(struct kimage *image,
+   unsigned int hp_action, unsigned long a, unsigned long b)
+{
+   /*
+* To accurately reflect hot un/plug changes, the elfcorehdr (which
+* is passed to the crash kernel via the elfcorehdr= parameter)
+* must be updated with the new list of CPUs and memories. The new
+* elfcorehdr is prepared in a kernel buffer, and if no errors,
+* then it is written on top of the existing/old elfcorehdr.
+*
+* Due to the change to the elfcorehdr, purgatory must explicitly
+* exclude the elfcorehdr from the list of segments it checks.
+*/
+   struct kexec_segment *ksegment;
+   unsigned char *ptr = NULL;
+   unsigned long elfsz = 0;
+   void *elfbuf = NULL;
+   unsigned long mem, memsz;
+   unsigned int n;
+
+   /*
+* When the struct kimage is alloced, it is wiped to zero, so
+* the elf_index_valid defaults to false. It is set on the
+* kexec_file_load path, or here for kexec_load.
+*/
+   if (!image->elf_index_valid) {
+  

[RFC v2 2/6] crash hp: Introduce CRASH_HOTPLUG configuration options

2021-12-07 Thread Eric DeVolder
Support for CPU and memory hotplug for crash is controlled by the
CRASH_HOTPLUG configuration option, introduced by this patch.

The CRASH_HOTPLUG_ELFCOREHDR_SZ related configuration option is
also introduced with this patch.

Signed-off-by: Eric DeVolder 
---
 arch/x86/Kconfig | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7399327d1eff..901b4a6c50c1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2046,6 +2046,32 @@ config CRASH_DUMP
  (CONFIG_RELOCATABLE=y).
  For more details see Documentation/admin-guide/kdump/kdump.rst
 
+config CRASH_HOTPLUG
+   bool "kernel updates of crash elfcorehdr"
+   depends on CRASH_DUMP && (HOTPLUG_CPU || MEMORY_HOTPLUG) && KEXEC_FILE
+   help
+ Enable the kernel to update the crash elfcorehdr (which contains
+ the list of CPUs and memory regions) directly when hot plug/unplug
+ of CPUs or memory. Otherwise userspace must monitor these hot
+ plug/unplug change notifications via udev in order to
+ unload-then-reload the crash kernel so that the list of CPUs and
+ memory regions is kept up-to-date. Note that the udev CPU and
+ memory change notifications still occur (however, userspace is not
+ required to monitor for crash dump purposes).
+
+config CRASH_HOTPLUG_ELFCOREHDR_SZ
+   depends on CRASH_HOTPLUG
+   int
+   default 131072
+   help
+ Specify the maximum size of the elfcorehdr buffer/segment.
+ The 128KiB default is sized so that it can accommodate 2048
+ Elf64_Phdr, where each Phdr represents either a CPU or a
+ region of memory.
+ For example, this size can accommodate hotplugging a machine
+ with up to 1024 CPUs and up to 1024 memory regions (e.g. 1TiB
+ with 1024 1GiB memory DIMMs).
+
 config KEXEC_JUMP
bool "kexec jump"
depends on KEXEC && HIBERNATION
-- 
2.27.0


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[RFC v2 3/6] crash hp: definitions and prototype changes

2021-12-07 Thread Eric DeVolder
This change adds members to struct kimage to facilitate crash
hotplug support.

This change also defines crash hotplug events and associated
prototypes.

Signed-off-by: Eric DeVolder 
---
 include/linux/kexec.h | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 0c994ae37729..068f853f1c65 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -221,8 +221,9 @@ struct crash_mem {
 extern int crash_exclude_mem_range(struct crash_mem *mem,
   unsigned long long mstart,
   unsigned long long mend);
-extern int crash_prepare_elf64_headers(struct crash_mem *mem, int kernel_map,
-  void **addr, unsigned long *sz);
+extern int crash_prepare_elf64_headers(struct kimage *image,
+   struct crash_mem *mem, int kernel_map,
+   void **addr, unsigned long *sz);
 #endif /* CONFIG_KEXEC_FILE */
 
 #ifdef CONFIG_KEXEC_ELF
@@ -299,6 +300,13 @@ struct kimage {
 
/* Information for loading purgatory */
struct purgatory_info purgatory_info;
+
+#ifdef CONFIG_CRASH_HOTPLUG
+   bool hotplug_event;
+   int offlinecpu;
+   bool elf_index_valid;
+   int elf_index;
+#endif
 #endif
 
 #ifdef CONFIG_IMA_KEXEC
@@ -315,6 +323,15 @@ struct kimage {
unsigned long elf_load_addr;
 };
 
+#ifdef CONFIG_CRASH_HOTPLUG
+void arch_crash_hotplug_handler(struct kimage *image,
+   unsigned int hp_action, unsigned long a, unsigned long b);
+#define KEXEC_CRASH_HP_REMOVE_CPU   0
+#define KEXEC_CRASH_HP_ADD_CPU  1
+#define KEXEC_CRASH_HP_REMOVE_MEMORY 2
+#define KEXEC_CRASH_HP_ADD_MEMORY   3
+#endif /* CONFIG_CRASH_HOTPLUG */
+
 /* kexec interface functions */
 extern void machine_kexec(struct kimage *image);
 extern int machine_kexec_prepare(struct kimage *image);
-- 
2.27.0


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2 0/6] KEXEC_SIG with appended signature

2021-12-07 Thread Michal Suchánek
On Tue, Dec 07, 2021 at 05:10:14PM +0100, Philipp Rudo wrote:
> Hi Michal,
> 
> i finally had the time to take a closer look at the series. Except for
> the nit in patch 4 and my personal preference in patch 6 the code looks
> good to me.
> 
> What I don't like are the commit messages on the first commits. In my
> opinion they are so short that they are almost useless. For example in
> patch 2 there is absolutely no explanation why you can simply copy the
> s390 over to ppc.

They use the same signature format. I suppose I can add a note saying
that.

> Or in patch 3 you are silently changing the error
> code in kexec from EKEYREJECT to ENODATA. So I would appreciate it if

Not sure what I should do about this. The different implementations use
different random error codes, and when they are unified the error code
clearly changes for one or the other.

Does anything depend on a particular error code returned?

Thanks

Michal

> you could improve them a little.
> 
> Thanks
> Philipp
> 
> On Thu, 25 Nov 2021 19:02:38 +0100
> Michal Suchanek  wrote:
> 
> > Hello,
> > 
> > This is resend of the KEXEC_SIG patchset.
> > 
> > The first patch is new because it'a a cleanup that does not require any
> > change to the module verification code.
> > 
> > The second patch is the only one that is intended to change any
> > functionality.
> > 
> > The rest only deduplicates code but I did not receive any review on that
> > part so I don't know if it's desirable as implemented.
> > 
> > The first two patches can be applied separately without the rest.
> > 
> > Thanks
> > 
> > Michal
> > 
> > Michal Suchanek (6):
> >   s390/kexec_file: Don't opencode appended signature check.
> >   powerpc/kexec_file: Add KEXEC_SIG support.
> >   kexec_file: Don't opencode appended signature verification.
> >   module: strip the signature marker in the verification function.
> >   module: Use key_being_used_for for log messages in
> > verify_appended_signature
> >   module: Move duplicate mod_check_sig users code to mod_parse_sig
> > 
> >  arch/powerpc/Kconfig | 11 +
> >  arch/powerpc/kexec/elf_64.c  | 14 ++
> >  arch/s390/kernel/machine_kexec_file.c| 42 ++
> >  crypto/asymmetric_keys/asymmetric_type.c |  1 +
> >  include/linux/module_signature.h |  1 +
> >  include/linux/verification.h |  4 ++
> >  kernel/module-internal.h |  2 -
> >  kernel/module.c  | 12 +++--
> >  kernel/module_signature.c| 56 +++-
> >  kernel/module_signing.c  | 33 +++---
> >  security/integrity/ima/ima_modsig.c  | 22 ++
> >  11 files changed, 113 insertions(+), 85 deletions(-)
> > 
> 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2 6/6] module: Move duplicate mod_check_sig users code to mod_parse_sig

2021-12-07 Thread Philipp Rudo
Hi Michal,

On Thu, 25 Nov 2021 19:02:44 +0100
Michal Suchanek  wrote:

> Multiple users of mod_check_sig check for the marker, then call
> mod_check_sig, extract signature length, and remove the signature.
> 
> Put this code in one place together with mod_check_sig.
> 
> Signed-off-by: Michal Suchanek 
> ---
>  include/linux/module_signature.h|  1 +
>  kernel/module_signature.c   | 56 -
>  kernel/module_signing.c | 26 +++---
>  security/integrity/ima/ima_modsig.c | 22 ++--
>  4 files changed, 63 insertions(+), 42 deletions(-)
> 
> diff --git a/include/linux/module_signature.h 
> b/include/linux/module_signature.h
> index 7eb4b00381ac..1343879b72b3 100644
> --- a/include/linux/module_signature.h
> +++ b/include/linux/module_signature.h
> @@ -42,5 +42,6 @@ struct module_signature {
>  
>  int mod_check_sig(const struct module_signature *ms, size_t file_len,
> const char *name);
> +int mod_parse_sig(const void *data, size_t *len, size_t *sig_len, const char 
> *name);
>  
>  #endif /* _LINUX_MODULE_SIGNATURE_H */
> diff --git a/kernel/module_signature.c b/kernel/module_signature.c
> index 00132d12487c..784b40575ee4 100644
> --- a/kernel/module_signature.c
> +++ b/kernel/module_signature.c
> @@ -8,14 +8,36 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> +/**
> + * mod_check_sig_marker - check that the given data has signature marker at 
> the end
> + *
> + * @data:Data with appended signature
> + * @len: Length of data. Signature marker length is subtracted on 
> success.
> + */
> +static inline int mod_check_sig_marker(const void *data, size_t *len)

I personally don't like it when a function has a "check" in it's name
as it doesn't describe what the function is checking for. For me
mod_has_sig_marker is much more precise. I would use that instead.

Thanks
Philipp

> +{
> + const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
> +
> + if (markerlen > *len)
> + return -ENODATA;
> +
> + if (memcmp(data + *len - markerlen, MODULE_SIG_STRING,
> +markerlen))
> + return -ENODATA;
> +
> + *len -= markerlen;
> + return 0;
> +}
> +
>  /**
>   * mod_check_sig - check that the given signature is sane
>   *
>   * @ms:  Signature to check.
> - * @file_len:Size of the file to which @ms is appended.
> + * @file_len:Size of the file to which @ms is appended (without the 
> marker).
>   * @name:What is being checked. Used for error messages.
>   */
>  int mod_check_sig(const struct module_signature *ms, size_t file_len,
> @@ -44,3 +66,35 @@ int mod_check_sig(const struct module_signature *ms, 
> size_t file_len,
>  
>   return 0;
>  }
> +
> +/**
> + * mod_parse_sig - check that the given signature is sane and determine 
> signature length
> + *
> + * @data:Data with appended signature.
> + * @len: Length of data. Signature and marker length is subtracted on 
> success.
> + * @sig_len: Length of signature. Filled on success.
> + * @name:What is being checked. Used for error messages.
> + */
> +int mod_parse_sig(const void *data, size_t *len, size_t *sig_len, const char 
> *name)
> +{
> + const struct module_signature *sig;
> + int rc;
> +
> + rc = mod_check_sig_marker(data, len);
> + if (rc)
> + return rc;
> +
> + if (*len < sizeof(*sig))
> + return -ENODATA;
> +
> + sig = (const struct module_signature *)(data + (*len - sizeof(*sig)));
> +
> + rc = mod_check_sig(sig, *len, name);
> + if (rc)
> + return rc;
> +
> + *sig_len = be32_to_cpu(sig->sig_len);
> + *len -= *sig_len + sizeof(*sig);
> +
> + return 0;
> +}
> diff --git a/kernel/module_signing.c b/kernel/module_signing.c
> index cef72a6f6b5d..02bbca90f467 100644
> --- a/kernel/module_signing.c
> +++ b/kernel/module_signing.c
> @@ -25,35 +25,17 @@ int verify_appended_signature(const void *data, size_t 
> *len,
> struct key *trusted_keys,
> enum key_being_used_for purpose)
>  {
> - const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
>   struct module_signature ms;
> - size_t sig_len, modlen = *len;
> + size_t sig_len;
>   int ret;
>  
> - pr_devel("==>%s %s(,%zu)\n", __func__, key_being_used_for[purpose], 
> modlen);  
> + pr_devel("==>%s %s(,%zu)\n", __func__, key_being_used_for[purpose], 
> *len);
>  
> - if (markerlen > modlen)
> - return -ENODATA;
> -
> - if (memcmp(data + modlen - markerlen, MODULE_SIG_STRING,
> -markerlen))
> - return -ENODATA;
> - modlen -= markerlen;
> -
> - if (modlen <= sizeof(ms))
> - return -EBADMSG;
> -
> - memcpy(, data + (modlen - sizeof(ms)), sizeof(ms));
> -
> - ret = mod_check_sig(, modlen, key_being_used_for[purpose]);
> + ret = mod_parse_sig(data, len, 

Re: [PATCH v2 4/6] module: strip the signature marker in the verification function.

2021-12-07 Thread Philipp Rudo
Hi Michal,

On Thu, 25 Nov 2021 19:02:42 +0100
Michal Suchanek  wrote:

> It is stripped by each caller separately.
> 
> Signed-off-by: Michal Suchanek 
> ---
>  arch/powerpc/kexec/elf_64.c   |  9 -
>  arch/s390/kernel/machine_kexec_file.c |  9 -
>  kernel/module.c   |  7 +--
>  kernel/module_signing.c   | 12 ++--

kernel/module_signing.c is only compiled with MODULE_SIG enabled but
KEXEC_SIG only selects MODULE_SIG_FORMAT. In the unlikely case that
KEXEC_SIG is enabled but MODULE_SIG isn't this causes a build breakage.
So you need to update KEXEC_SIG to select MODULE_SIG instead of
MODULE_SIG_FORMAT for s390 and ppc.

Thanks
Philipp

>  4 files changed, 11 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
> index 266cb26d3ca0..63634c95265d 100644
> --- a/arch/powerpc/kexec/elf_64.c
> +++ b/arch/powerpc/kexec/elf_64.c
> @@ -156,15 +156,6 @@ static void *elf64_load(struct kimage *image, char 
> *kernel_buf,
>  int elf64_verify_sig(const char *kernel, unsigned long length)
>  {
>   size_t kernel_len = length;
> - const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
> -
> - if (marker_len > kernel_len)
> - return -EKEYREJECTED;
> -
> - if (memcmp(kernel + kernel_len - marker_len, MODULE_SIG_STRING,
> -marker_len))
> - return -EKEYREJECTED;
> - kernel_len -= marker_len;
>  
>   return verify_appended_signature(kernel, _len, 
> VERIFY_USE_PLATFORM_KEYRING,
>   "kexec_file");
> diff --git a/arch/s390/kernel/machine_kexec_file.c 
> b/arch/s390/kernel/machine_kexec_file.c
> index 432797249db3..c4632c1a1b59 100644
> --- a/arch/s390/kernel/machine_kexec_file.c
> +++ b/arch/s390/kernel/machine_kexec_file.c
> @@ -27,20 +27,11 @@ const struct kexec_file_ops * const kexec_file_loaders[] 
> = {
>  int s390_verify_sig(const char *kernel, unsigned long length)
>  {
>   size_t kernel_len = length;
> - const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
>  
>   /* Skip signature verification when not secure IPLed. */
>   if (!ipl_secure_flag)
>   return 0;
>  
> - if (marker_len > kernel_len)
> - return -EKEYREJECTED;
> -
> - if (memcmp(kernel + kernel_len - marker_len, MODULE_SIG_STRING,
> -marker_len))
> - return -EKEYREJECTED;
> - kernel_len -= marker_len;
> -
>   return verify_appended_signature(kernel, _len, 
> VERIFY_USE_PLATFORM_KEYRING,
>   "kexec_file");
>  }
> diff --git a/kernel/module.c b/kernel/module.c
> index 8481933dfa92..d91ca0f93a40 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -2882,7 +2882,6 @@ static inline void kmemleak_load_module(const struct 
> module *mod,
>  static int module_sig_check(struct load_info *info, int flags)
>  {
>   int err = -ENODATA;
> - const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
>   const char *reason;
>   const void *mod = info->hdr;
>  
> @@ -2890,11 +2889,7 @@ static int module_sig_check(struct load_info *info, 
> int flags)
>* Require flags == 0, as a module with version information
>* removed is no longer the module that was signed
>*/
> - if (flags == 0 &&
> - info->len > markerlen &&
> - memcmp(mod + info->len - markerlen, MODULE_SIG_STRING, markerlen) 
> == 0) {
> - /* We truncate the module to discard the signature */
> - info->len -= markerlen;
> + if (flags == 0) {
>   err = verify_appended_signature(mod, >len,
>   VERIFY_USE_SECONDARY_KEYRING, 
> "module");
>   if (!err) {
> diff --git a/kernel/module_signing.c b/kernel/module_signing.c
> index f492e410564d..4c28cb55275f 100644
> --- a/kernel/module_signing.c
> +++ b/kernel/module_signing.c
> @@ -15,8 +15,7 @@
>  #include "module-internal.h"
>  
>  /**
> - * verify_appended_signature - Verify the signature on a module with the
> - * signature marker stripped.
> + * verify_appended_signature - Verify the signature on a module
>   * @data: The data to be verified
>   * @len: Size of @data.
>   * @trusted_keys: Keyring to use for verification
> @@ -25,12 +24,21 @@
>  int verify_appended_signature(const void *data, size_t *len,
> struct key *trusted_keys, const char *what)
>  {
> + const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
>   struct module_signature ms;
>   size_t sig_len, modlen = *len;
>   int ret;
>  
>   pr_devel("==>%s(,%zu)\n", __func__, modlen);  
>  
> + if (markerlen > modlen)
> + return -ENODATA;
> +
> + if (memcmp(data + modlen - markerlen, MODULE_SIG_STRING,
> +markerlen))
> + return -ENODATA;
> + modlen -= markerlen;
> +
>   if (modlen <= 

Re: [PATCH 3/3] s390: add support for --reuse-cmdline

2021-12-07 Thread Sven Schnelle
Hi Philipp,

Philipp Rudo  writes:

>> diff --git a/kexec/arch/s390/kexec-image.c b/kexec/arch/s390/kexec-image.c
>> index dbeb689b830a..310d967ea331 100644
>> --- a/kexec/arch/s390/kexec-image.c
>> +++ b/kexec/arch/s390/kexec-image.c
>> @@ -72,6 +72,10 @@ int image_s390_load_file(int argc, char **argv, struct 
>> kexec_info *info)
>>  case OPT_RAMDISK:
>>  ramdisk = optarg;
>>  break;
>> +case OPT_REUSE_CMDLINE:
>> +free(command_line);
>> +command_line = get_command_line();
>
> get_command_line reads a maximum of 2048 bytes from /prc/cmdline. With
> the configurable size on s390 defaulting to 4096 bytes this will
> ultimately cause problems. So you need to make get_command_line more
> flexible first.

Thanks for pointing this out, i wasn't aware of that limitation. So we
likely want to change that to some dynamic allocation.

Sven

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2 0/6] KEXEC_SIG with appended signature

2021-12-07 Thread Philipp Rudo
Hi Michal,

i finally had the time to take a closer look at the series. Except for
the nit in patch 4 and my personal preference in patch 6 the code looks
good to me.

What I don't like are the commit messages on the first commits. In my
opinion they are so short that they are almost useless. For example in
patch 2 there is absolutely no explanation why you can simply copy the
s390 over to ppc. Or in patch 3 you are silently changing the error
code in kexec from EKEYREJECT to ENODATA. So I would appreciate it if
you could improve them a little.

Thanks
Philipp

On Thu, 25 Nov 2021 19:02:38 +0100
Michal Suchanek  wrote:

> Hello,
> 
> This is resend of the KEXEC_SIG patchset.
> 
> The first patch is new because it'a a cleanup that does not require any
> change to the module verification code.
> 
> The second patch is the only one that is intended to change any
> functionality.
> 
> The rest only deduplicates code but I did not receive any review on that
> part so I don't know if it's desirable as implemented.
> 
> The first two patches can be applied separately without the rest.
> 
> Thanks
> 
> Michal
> 
> Michal Suchanek (6):
>   s390/kexec_file: Don't opencode appended signature check.
>   powerpc/kexec_file: Add KEXEC_SIG support.
>   kexec_file: Don't opencode appended signature verification.
>   module: strip the signature marker in the verification function.
>   module: Use key_being_used_for for log messages in
> verify_appended_signature
>   module: Move duplicate mod_check_sig users code to mod_parse_sig
> 
>  arch/powerpc/Kconfig | 11 +
>  arch/powerpc/kexec/elf_64.c  | 14 ++
>  arch/s390/kernel/machine_kexec_file.c| 42 ++
>  crypto/asymmetric_keys/asymmetric_type.c |  1 +
>  include/linux/module_signature.h |  1 +
>  include/linux/verification.h |  4 ++
>  kernel/module-internal.h |  2 -
>  kernel/module.c  | 12 +++--
>  kernel/module_signature.c| 56 +++-
>  kernel/module_signing.c  | 33 +++---
>  security/integrity/ima/ima_modsig.c  | 22 ++
>  11 files changed, 113 insertions(+), 85 deletions(-)
> 


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/3] s390: add variable command line size

2021-12-07 Thread Sven Schnelle
Hi Philipp,

Philipp Rudo  writes:

>> diff --git a/kexec/arch/s390/kexec-image.c b/kexec/arch/s390/kexec-image.c
>> index 3c24fdfe3c7c..7747d02399db 100644
>> --- a/kexec/arch/s390/kexec-image.c
>> +++ b/kexec/arch/s390/kexec-image.c
>> @@ -25,7 +25,7 @@
>>  #include 
>>  
>>  static uint64_t crash_base, crash_end;
>> -static char command_line[COMMAND_LINESIZE];
>> +static char *command_line;
>
> isn't this the perfect opportunity to get rid of this global variable
> and...
>
>>  static void add_segment_check(struct kexec_info *info, const void *buf,
>>size_t bufsz, unsigned long base, size_t memsz)
>> @@ -38,11 +38,16 @@ static void add_segment_check(struct kexec_info *info, 
>> const void *buf,
>>  
>>  int command_line_add(const char *str)
>
> ... simply pass the pointer as an argument ;)

The reason for it being global is that command_line_add() might get
called multiple times. But yes, we could move that variable scope into
the calling function.

Thanks
Sven

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 3/3] s390: add support for --reuse-cmdline

2021-12-07 Thread Philipp Rudo
Hi Sven,

makes absolutely sense to have this option. One problem though...

On Mon, 22 Nov 2021 08:14:01 +0100
Sven Schnelle  wrote:

> --reuse-cmdline reads the command line of the currently
> running kernel from /proc/cmdline and uses that for the
> kernel that should be kexec'd.
> 
> Signed-off-by: Sven Schnelle 
> Reviewed-by: Alexander Egorenkov 
> ---
>  kexec/arch/s390/include/arch/options.h | 10 ++
>  kexec/arch/s390/kexec-image.c  |  9 +
>  2 files changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/kexec/arch/s390/include/arch/options.h 
> b/kexec/arch/s390/include/arch/options.h
> index 76044a301ceb..b030b61d61be 100644
[...]
> diff --git a/kexec/arch/s390/kexec-image.c b/kexec/arch/s390/kexec-image.c
> index dbeb689b830a..310d967ea331 100644
> --- a/kexec/arch/s390/kexec-image.c
> +++ b/kexec/arch/s390/kexec-image.c
> @@ -72,6 +72,10 @@ int image_s390_load_file(int argc, char **argv, struct 
> kexec_info *info)
>   case OPT_RAMDISK:
>   ramdisk = optarg;
>   break;
> + case OPT_REUSE_CMDLINE:
> + free(command_line);
> + command_line = get_command_line();

get_command_line reads a maximum of 2048 bytes from /prc/cmdline. With
the configurable size on s390 defaulting to 4096 bytes this will
ultimately cause problems. So you need to make get_command_line more
flexible first.

Thanks
Philipp

> + break;
>   }
>   }
>  
> @@ -123,6 +127,10 @@ image_s390_load(int argc, char **argv, const char 
> *kernel_buf,
>   if (command_line_add(optarg))
>   return -1;
>   break;
> + case OPT_REUSE_CMDLINE:
> + free(command_line);
> + command_line = get_command_line();
> + break;
>   case OPT_RAMDISK:
>   ramdisk = optarg;
>   break;
> @@ -223,5 +231,6 @@ image_s390_usage(void)
>   printf("--command-line=STRING Set the kernel command line to STRING.\n"
>  "--append=STRING   Set the kernel command line to STRING.\n"
>  "--initrd=FILENAME Use the file FILENAME as a ramdisk.\n"
> +"--reuse-cmdline   Use kernel command line from running 
> system.\n"
>   );
>  }


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/3] s390: add variable command line size

2021-12-07 Thread Philipp Rudo
Hi Sven,

On Mon, 22 Nov 2021 08:13:59 +0100
Sven Schnelle  wrote:

> Newer s390 kernels support a command line size longer than 896
> bytes. Such kernels contain a new member in the parameter area,
> which might be utilized by tools like kexec. Older kernels have
> the location initialized to zero, so we check whether there's a
> non-zero number present and use that. If there isn't, we fallback
> to the legacy command line size of 896 bytes.
> 
> Signed-off-by: Sven Schnelle 
> Reviewed-by: Alexander Egorenkov 
> ---
>  kexec/arch/s390/kexec-image.c | 53 ---
>  kexec/arch/s390/kexec-s390.h  | 19 +++--
>  2 files changed, 46 insertions(+), 26 deletions(-)
> 
> diff --git a/kexec/arch/s390/kexec-image.c b/kexec/arch/s390/kexec-image.c
> index 3c24fdfe3c7c..7747d02399db 100644
> --- a/kexec/arch/s390/kexec-image.c
> +++ b/kexec/arch/s390/kexec-image.c
> @@ -25,7 +25,7 @@
>  #include 
>  
>  static uint64_t crash_base, crash_end;
> -static char command_line[COMMAND_LINESIZE];
> +static char *command_line;

isn't this the perfect opportunity to get rid of this global variable
and...

>  static void add_segment_check(struct kexec_info *info, const void *buf,
> size_t bufsz, unsigned long base, size_t memsz)
> @@ -38,11 +38,16 @@ static void add_segment_check(struct kexec_info *info, 
> const void *buf,
>  
>  int command_line_add(const char *str)

... simply pass the pointer as an argument ;)

Thanks
Philipp

>  {
> - if (strlen(command_line) + strlen(str) + 1 > COMMAND_LINESIZE) {
> - fprintf(stderr, "Command line too long.\n");
> + char *tmp = NULL;
> +
> + tmp = concat_cmdline(command_line, str);
> + if (!tmp) {
> + fprintf(stderr, "out of memory\n");
>   return -1;
>   }
> - strcat(command_line, str);
> +
> + free(command_line);
> + command_line = tmp;
>   return 0;
>  }
>  
> @@ -78,6 +83,8 @@ int image_s390_load_file(int argc, char **argv, struct 
> kexec_info *info)
>   if (info->initrd_fd == -1) {
>   fprintf(stderr, "Could not open initrd file %s:%s\n",
>   ramdisk, strerror(errno));
> + free(command_line);
> + command_line = NULL;
>   return -1;
>   }
>   }
> @@ -97,7 +104,7 @@ image_s390_load(int argc, char **argv, const char 
> *kernel_buf,
>   const char *ramdisk;
>   off_t ramdisk_len;
>   unsigned int ramdisk_origin;
> - int opt;
> + int opt, ret = -1;
>  
>   if (info->file_mode)
>   return image_s390_load_file(argc, argv, info);
> @@ -112,7 +119,6 @@ image_s390_load(int argc, char **argv, const char 
> *kernel_buf,
>   };
>   static const char short_options[] = KEXEC_OPT_STR "";
>  
> - command_line[0] = 0;
>   ramdisk = NULL;
>   ramdisk_len = 0;
>   ramdisk_origin = 0;
> @@ -132,7 +138,7 @@ image_s390_load(int argc, char **argv, const char 
> *kernel_buf,
>   if (info->kexec_flags & KEXEC_ON_CRASH) {
>   if (parse_iomem_single("Crash kernel\n", _base,
>  _end))
> - return -1;
> + goto out;
>   }
>  
>   /* Add kernel segment */
> @@ -151,7 +157,7 @@ image_s390_load(int argc, char **argv, const char 
> *kernel_buf,
>   rd_buffer = slurp_file_mmap(ramdisk, _len);
>   if (rd_buffer == NULL) {
>   fprintf(stderr, "Could not read ramdisk.\n");
> - return -1;
> + goto out;
>   }
>   ramdisk_origin = MAX(RAMDISK_ORIGIN_ADDR, kernel_size);
>   ramdisk_origin = _ALIGN_UP(ramdisk_origin, 0x10);
> @@ -160,7 +166,7 @@ image_s390_load(int argc, char **argv, const char 
> *kernel_buf,
>   }
>   if (info->kexec_flags & KEXEC_ON_CRASH) {
>   if (load_crashdump_segments(info, crash_base, crash_end))
> - return -1;
> + goto out;
>   } else {
>   info->entry = (void *) IMAGE_READ_OFFSET;
>   }
> @@ -183,15 +189,28 @@ image_s390_load(int argc, char **argv, const char 
> *kernel_buf,
>   *tmp = crash_end - crash_base + 1;
>   }
>   }
> - /*
> -  * We will write a probably given command line.
> -  * First, erase the old area, then setup the new parameters:
> -  */
> - if (strlen(command_line) != 0) {
> - memset(krnl_buffer + COMMAND_LINE_OFFS, 0, COMMAND_LINESIZE);
> - memcpy(krnl_buffer + COMMAND_LINE_OFFS, command_line, 
> strlen(command_line));
> +
> + if (command_line) {
> + unsigned long maxsize;
> + char *dest = krnl_buffer + COMMAND_LINE_OFFS;
> +
> + maxsize = *(unsigned long *)(krnl_buffer + 
> MAX_COMMAND_LINESIZE_OFFS);
> + if 

Re: [PATCH] kernel/crash_core: suppress unknown crashkernel parameter warning

2021-12-07 Thread Philipp Rudo
Hi Baoquan,

On Tue, 7 Dec 2021 11:34:46 +0800
Baoquan He  wrote:

> On 12/06/21 at 12:17pm, Philipp Rudo wrote:
> > When booting with crashkernel= on the kernel command line a warning
> > similar to
> > 
> > [0.038294] Kernel command line: ro console=ttyS0 crashkernel=256M
> > [0.038353] Unknown kernel command line parameters "crashkernel=256M", 
> > will be passed to user space.
> > 
> > is printed. This originates from crashkernel= being parsed independent from
> > the early_param() mechanism. So the code in init/main.c doesn't know  
> 
> Not only the early_param(), __setup() also takes the same mechanism.
> It's just handled in different stage. You might need to call it kernel
> param handling mechanism, not sure if it's accurate.

you are right, "kernel param handling" is better. I used early_param as
that's where we would need to hook into if we wanted to use the common
kernel param handling. But I don't think it is worth it.

@akpm: do you update the commit message before sending the patch to
   Linus or shall I send a v2?

> > that crashkernel= is a valid kernel parameter and prints this incorrect
> > warning. Suppress the warning by adding a dummy early_param handler for
> > crashkernel=.  
> 
> The fix looks good to me, thanks.
> 
> Acked-by: Baoquan He 

Thanks

> By the way, on which arch did you find this issue? Ask because I am
> wondering whether there's any other similiar independent kernel cmdline
> handling from __setup_param(). If have, is there a chance to take a
> common method to handle them, e.g a generic function or a place to
> identify them. Just wild thought, I have no idea yet. Otherwise, we may
> need several this kind of dummy handler for each one.

The issue was first reported on s390 but I used x86 to test the fix.
The only other reported parameter I encountered was BOOT_IMAGE= which
is not a kernel parameter and thus correct. But in the corresponding
bugzilla Andrew (on cc) said "Gah! I thought I had squashed all of
these interesting uses of the kernel command line, it is like playing 
whack-a-mole."
So I believe there were multiple other parameters that had the same problem.

Thanks
Philipp

> > Fixes: 86d1919a4fb0 ("init: print out unknown kernel parameters")
> > Signed-off-by: Philipp Rudo 
> > ---
> >  kernel/crash_core.c | 11 +++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > index eb53f5ec62c9..256cf6db573c 100644
> > --- a/kernel/crash_core.c
> > +++ b/kernel/crash_core.c
> > @@ -6,6 +6,7 @@
> >  
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  
> > @@ -295,6 +296,16 @@ int __init parse_crashkernel_low(char *cmdline,
> > "crashkernel=", suffix_tbl[SUFFIX_LOW]);
> >  }
> >  
> > +/*
> > + * Add a dummy early_param handler to mark crashkernel= as a known command 
> > line
> > + * parameter and suppress incorrect warnings in init/main.c.
> > + */
> > +static int __init parse_crashkernel_dummy(char *arg)
> > +{
> > +   return 0;
> > +}
> > +early_param("crashkernel", parse_crashkernel_dummy);
> > +
> >  Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
> >   void *data, size_t data_len)
> >  {
> > -- 
> > 2.31.1
> > 
> > 
> > ___
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
> >   
> 


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH RESEND v2 3/5] mm_zone: add function to check if managed dma zone exists

2021-12-07 Thread David Hildenbrand
On 07.12.21 04:07, Baoquan He wrote:
> In some places of the current kernel, it assumes that dma zone must have
> managed pages if CONFIG_ZONE_DMA is enabled. While this is not always true.
> E.g in kdump kernel of x86_64, only low 1M is presented and locked down
> at very early stage of boot, so that there's no managed pages at all in
> DMA zone. This exception will always cause page allocation failure if page
> is requested from DMA zone.
> 
> Here add function has_managed_dma() and the relevant helper functions to
> check if there's DMA zone with managed pages. It will be used in later
> patches.
> 
> Signed-off-by: Baoquan He 
> ---
>  include/linux/mmzone.h | 21 +
>  mm/page_alloc.c| 11 +++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 58e744b78c2c..82d23e13e0e5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -998,6 +998,18 @@ static inline bool zone_is_zone_device(struct zone *zone)
>  }
>  #endif
>  
> +#ifdef CONFIG_ZONE_DMA
> +static inline bool zone_is_dma(struct zone *zone)
> +{
> + return zone_idx(zone) == ZONE_DMA;
> +}
> +#else
> +static inline bool zone_is_dma(struct zone *zone)
> +{
> + return false;
> +}
> +#endif
> +
>  /*
>   * Returns true if a zone has pages managed by the buddy allocator.
>   * All the reclaim decisions have to use this function rather than
> @@ -1046,6 +1058,7 @@ static inline int is_highmem_idx(enum zone_type idx)
>  #endif
>  }
>  
> +bool has_managed_dma(void);
>  /**
>   * is_highmem - helper function to quickly check if a struct zone is a
>   *  highmem zone or not.  This is an attempt to keep references
> @@ -1131,6 +1144,14 @@ extern struct zone *next_zone(struct zone *zone);
>   ; /* do nothing */  \
>   else
>  
> +#define for_each_managed_zone(zone)  \
> + for (zone = (first_online_pgdat())->node_zones; \
> +  zone;  \
> +  zone = next_zone(zone))\
> + if (!managed_zone(zone))\
> + ; /* do nothing */  \
> + else
> +
>  static inline struct zone *zonelist_zone(struct zoneref *zoneref)
>  {
>   return zoneref->zone;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c5952749ad40..ac0ea42a4e5f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -9459,4 +9459,15 @@ bool take_page_off_buddy(struct page *page)
>   spin_unlock_irqrestore(>lock, flags);
>   return ret;
>  }
> +
> +bool has_managed_dma(void)
> +{
> + struct zone *zone;
> +
> + for_each_managed_zone(zone) {
> + if (zone_is_dma(zone))
> + return true;
> + }
> + return false;
> +}

Wouldn't it be "easier/faster" to just iterate online nodes and directly
obtain the ZONE_DMA, checking if there are managed pages?


-- 
Thanks,

David / dhildenb


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [RFC PATCH 01/14] fs/proc/vmcore: Update read_from_oldmem() for user pointer

2021-12-07 Thread Christoph Hellwig
On Mon, Dec 06, 2021 at 03:07:15PM +, Matthew Wilcox wrote:
> > > What do you think to adding a generic copy_pfn_to_iter()?  Not sure
> > > which APIs to use to implement it ... some architectures have weird
> > > requirements about which APIs can be used for what kinds of PFNs.
> > 
> > Hmm.  I though kmap_local_pfn(_prot) is all we need?
> 
> In the !HIGHMEM case, that calls pfn_to_page(), and I think the
> point of this path is that we don't have a struct page for this pfn.

Indeed.  But to me this suggest that the !highmem stub is broken and
we should probably fix it rather than adding yet another interface.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH RESEND v2 0/5] Avoid requesting page from DMA zone when no managed pages

2021-12-07 Thread Christoph Lameter
On Tue, 7 Dec 2021, Baoquan He wrote:

> into ZONE_DMA32 by default. The zone DMA covering low 16M is used to
> take care of antique ISA devices. In fact, on 64bit system, it rarely
> need ZONE_DMA (which is low 16M) to support almost extinct ISA devices.
> However, some components treat DMA as a generic concept, e.g
> kmalloc-dma, slab allocator initializes it for later any DMA related
> buffer allocation, but not limited to ISA DMA.

The idea of the slab allocator DMA support is to have memory available
for devices that can only support a limited range of physical addresses.
These are only to be enabled for platforms that have such requirements.

The slab allocators guarantee that all kmalloc allocations are DMA able
indepent of specifying ZONE_DMA/ZONE_DMA32

> On arm64, even though both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32
> are enabled, it makes ZONE_DMA covers the low 4G area, and ZONE_DMA32
> empty. Unless on specific platforms (e.g. 30-bit on Raspberry Pi 4),
> then zone DMA covers the 1st 1G area, zone DMA32 covers the rest of
> the 32-bit addressable memory.

ZONE_NORMAL should cover all memory. ARM does not need ZONE_DMA32.

> I am wondering if we can also change the size of DMA and DMA32 ZONE as
> dynamically adjusted, just as arm64 is doing? On x86_64, we can make
> zone DMA covers the 32-bit addressable memory, and empty zone DMA32 by
> default. Once ISA_DMA_API is enabled, we go back to make zone DMA covers
> low 16M area, zone DMA32 covers the rest of 32-bit addressable memory.
> (I am not familiar with ISA_DMA_API, will it require 24-bit addressable
> memory when enabled?)

The size of ZONE_DMA is traditionally depending on the platform. On some
it is 16MB, on some 1G and on some 4GB. ZONE32 is always 4GB and should
only be used if ZONE_DMA has already been used.

ZONE_DMA is dynamic in the sense of being different on different
platforms.

Generally I guess it would be possible to use ZONE_DMA for generic tagging
of special memory that can be configured to have a dynamic size but that is
not what it was designed to do.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec