Re: [PATCH v8 2/4] tpm: of: Make of-tree specific function commonly available

2022-09-01 Thread Jarkko Sakkinen
On Thu, Sep 01, 2022 at 05:46:08PM -0400, Stefan Berger wrote:
> Simplify tpm_read_log_of() by moving reusable parts of the code into
> an inline function that makes it commonly available so it can be
> used also for kexec support. Call the new of_tpm_get_sml_parameters()
> function from the TPM Open Firmware driver.
> 
> Signed-off-by: Stefan Berger 
> Cc: Jarkko Sakkinen 
> Cc: Jason Gunthorpe 
> Cc: Rob Herring 
> Cc: Frank Rowand 
> Reviewed-by: Mimi Zohar 
> Tested-by: Nageswara R Sastry 
> Reviewed-by: Jarkko Sakkinen 
> Tested-by: Coiby Xu 
> 
> ---
> v7:
>  - Added original comment back into inlined function
> 
> v4:
>  - converted to inline function
> ---
>  drivers/char/tpm/eventlog/of.c | 31 +
>  include/linux/tpm.h| 36 ++
>  2 files changed, 41 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/char/tpm/eventlog/of.c b/drivers/char/tpm/eventlog/of.c
> index a9ce66d09a75..f9462d19632e 100644
> --- a/drivers/char/tpm/eventlog/of.c
> +++ b/drivers/char/tpm/eventlog/of.c
> @@ -12,6 +12,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include "../tpm.h"
> @@ -20,11 +21,10 @@
>  int tpm_read_log_of(struct tpm_chip *chip)
>  {
>   struct device_node *np;
> - const u32 *sizep;
> - const u64 *basep;
>   struct tpm_bios_log *log;
>   u32 size;
>   u64 base;
> + int ret;
>  
>   log = >log;
>   if (chip->dev.parent && chip->dev.parent->of_node)
> @@ -35,30 +35,9 @@ int tpm_read_log_of(struct tpm_chip *chip)
>   if (of_property_read_bool(np, "powered-while-suspended"))
>   chip->flags |= TPM_CHIP_FLAG_ALWAYS_POWERED;
>  
> - sizep = of_get_property(np, "linux,sml-size", NULL);
> - basep = of_get_property(np, "linux,sml-base", NULL);
> - if (sizep == NULL && basep == NULL)
> - return -ENODEV;
> - if (sizep == NULL || basep == NULL)
> - return -EIO;
> -
> - /*
> -  * For both vtpm/tpm, firmware has log addr and log size in big
> -  * endian format. But in case of vtpm, there is a method called
> -  * sml-handover which is run during kernel init even before
> -  * device tree is setup. This sml-handover function takes care
> -  * of endianness and writes to sml-base and sml-size in little
> -  * endian format. For this reason, vtpm doesn't need conversion
> -  * but physical tpm needs the conversion.
> -  */
> - if (of_property_match_string(np, "compatible", "IBM,vtpm") < 0 &&
> - of_property_match_string(np, "compatible", "IBM,vtpm20") < 0) {
> - size = be32_to_cpup((__force __be32 *)sizep);
> - base = be64_to_cpup((__force __be64 *)basep);
> - } else {
> - size = *sizep;
> - base = *basep;
> - }
> + ret = of_tpm_get_sml_parameters(np, , );
> + if (ret < 0)
> + return ret;
>  
>   if (size == 0) {
>   dev_warn(>dev, "%s: Event log area empty\n", __func__);
> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> index dfeb25a0362d..6356baaa1393 100644
> --- a/include/linux/tpm.h
> +++ b/include/linux/tpm.h
> @@ -460,4 +460,40 @@ static inline struct tpm_chip *tpm_default_chip(void)
>   return NULL;
>  }
>  #endif
> +
> +#ifdef CONFIG_OF
> +static inline int of_tpm_get_sml_parameters(struct device_node *np,
> + u64 *base, u32 *size)
> +{
> + const u32 *sizep;
> + const u64 *basep;
> +
> + sizep = of_get_property(np, "linux,sml-size", NULL);
> + basep = of_get_property(np, "linux,sml-base", NULL);
> + if (sizep == NULL && basep == NULL)
> + return -ENODEV;
> + if (sizep == NULL || basep == NULL)
> + return -EIO;
> +
> + /*
> +  * For both vtpm/tpm, firmware has log addr and log size in big
> +  * endian format. But in case of vtpm, there is a method called
> +  * sml-handover which is run during kernel init even before
> +  * device tree is setup. This sml-handover function takes care
> +  * of endianness and writes to sml-base and sml-size in little
> +  * endian format. For this reason, vtpm doesn't need conversion
> +  * but physical tpm needs the conversion.
> +  */
> + if (of_property_match_string(np, "compatible", "IBM,vtpm") < 0 &&
> + of_property_match_string(np, "compatible", "IBM,vtpm20") < 0) {
> + *size = be32_to_cpup((__force __be32 *)sizep);
> + *base = be64_to_cpup((__force __be64 *)basep);
> + } else {
> + *size = *sizep;
> + *base = *basep;
> + }
> + return 0;
> +}
> +#endif
> +
>  #endif
> -- 
> 2.35.1
> 

Acked-by: Jarkko Sakkinen 

BR, Jarkko

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v8 2/4] tpm: of: Make of-tree specific function commonly available

2022-09-01 Thread Stefan Berger
Simplify tpm_read_log_of() by moving reusable parts of the code into
an inline function that makes it commonly available so it can be
used also for kexec support. Call the new of_tpm_get_sml_parameters()
function from the TPM Open Firmware driver.

Signed-off-by: Stefan Berger 
Cc: Jarkko Sakkinen 
Cc: Jason Gunthorpe 
Cc: Rob Herring 
Cc: Frank Rowand 
Reviewed-by: Mimi Zohar 
Tested-by: Nageswara R Sastry 
Reviewed-by: Jarkko Sakkinen 
Tested-by: Coiby Xu 

---
v7:
 - Added original comment back into inlined function

v4:
 - converted to inline function
---
 drivers/char/tpm/eventlog/of.c | 31 +
 include/linux/tpm.h| 36 ++
 2 files changed, 41 insertions(+), 26 deletions(-)

diff --git a/drivers/char/tpm/eventlog/of.c b/drivers/char/tpm/eventlog/of.c
index a9ce66d09a75..f9462d19632e 100644
--- a/drivers/char/tpm/eventlog/of.c
+++ b/drivers/char/tpm/eventlog/of.c
@@ -12,6 +12,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #include "../tpm.h"
@@ -20,11 +21,10 @@
 int tpm_read_log_of(struct tpm_chip *chip)
 {
struct device_node *np;
-   const u32 *sizep;
-   const u64 *basep;
struct tpm_bios_log *log;
u32 size;
u64 base;
+   int ret;
 
log = >log;
if (chip->dev.parent && chip->dev.parent->of_node)
@@ -35,30 +35,9 @@ int tpm_read_log_of(struct tpm_chip *chip)
if (of_property_read_bool(np, "powered-while-suspended"))
chip->flags |= TPM_CHIP_FLAG_ALWAYS_POWERED;
 
-   sizep = of_get_property(np, "linux,sml-size", NULL);
-   basep = of_get_property(np, "linux,sml-base", NULL);
-   if (sizep == NULL && basep == NULL)
-   return -ENODEV;
-   if (sizep == NULL || basep == NULL)
-   return -EIO;
-
-   /*
-* For both vtpm/tpm, firmware has log addr and log size in big
-* endian format. But in case of vtpm, there is a method called
-* sml-handover which is run during kernel init even before
-* device tree is setup. This sml-handover function takes care
-* of endianness and writes to sml-base and sml-size in little
-* endian format. For this reason, vtpm doesn't need conversion
-* but physical tpm needs the conversion.
-*/
-   if (of_property_match_string(np, "compatible", "IBM,vtpm") < 0 &&
-   of_property_match_string(np, "compatible", "IBM,vtpm20") < 0) {
-   size = be32_to_cpup((__force __be32 *)sizep);
-   base = be64_to_cpup((__force __be64 *)basep);
-   } else {
-   size = *sizep;
-   base = *basep;
-   }
+   ret = of_tpm_get_sml_parameters(np, , );
+   if (ret < 0)
+   return ret;
 
if (size == 0) {
dev_warn(>dev, "%s: Event log area empty\n", __func__);
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index dfeb25a0362d..6356baaa1393 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -460,4 +460,40 @@ static inline struct tpm_chip *tpm_default_chip(void)
return NULL;
 }
 #endif
+
+#ifdef CONFIG_OF
+static inline int of_tpm_get_sml_parameters(struct device_node *np,
+   u64 *base, u32 *size)
+{
+   const u32 *sizep;
+   const u64 *basep;
+
+   sizep = of_get_property(np, "linux,sml-size", NULL);
+   basep = of_get_property(np, "linux,sml-base", NULL);
+   if (sizep == NULL && basep == NULL)
+   return -ENODEV;
+   if (sizep == NULL || basep == NULL)
+   return -EIO;
+
+   /*
+* For both vtpm/tpm, firmware has log addr and log size in big
+* endian format. But in case of vtpm, there is a method called
+* sml-handover which is run during kernel init even before
+* device tree is setup. This sml-handover function takes care
+* of endianness and writes to sml-base and sml-size in little
+* endian format. For this reason, vtpm doesn't need conversion
+* but physical tpm needs the conversion.
+*/
+   if (of_property_match_string(np, "compatible", "IBM,vtpm") < 0 &&
+   of_property_match_string(np, "compatible", "IBM,vtpm20") < 0) {
+   *size = be32_to_cpup((__force __be32 *)sizep);
+   *base = be64_to_cpup((__force __be64 *)basep);
+   } else {
+   *size = *sizep;
+   *base = *basep;
+   }
+   return 0;
+}
+#endif
+
 #endif
-- 
2.35.1


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v8 1/4] drivers: of: kexec ima: Support 32-bit platforms

2022-09-01 Thread Stefan Berger
From: Palmer Dabbelt 

RISC-V recently added kexec_file() support, which uses enables kexec
IMA.  We're the first 32-bit platform to support this, so we found a
build bug.

Acked-by: Rob Herring 
Signed-off-by: Palmer Dabbelt 
Reviewed-by: Mimi Zohar 
---
 drivers/of/kexec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index e6c01db393f9..548dd5b1b5c1 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -250,8 +250,8 @@ static int setup_ima_buffer(const struct kimage *image, 
void *fdt,
if (ret)
return -EINVAL;
 
-   pr_debug("IMA buffer at 0x%llx, size = 0x%zx\n",
-image->ima_buffer_addr, image->ima_buffer_size);
+   pr_debug("IMA buffer at 0x%pa, size = 0x%zx\n",
+>ima_buffer_addr, image->ima_buffer_size);
 
return 0;
 }
-- 
2.35.1


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v8 3/4] of: kexec: Refactor IMA buffer related functions to make them reusable

2022-09-01 Thread Stefan Berger
Refactor IMA buffer related functions to make them reusable for carrying
TPM logs across kexec.

Signed-off-by: Stefan Berger 
Cc: Rob Herring 
Cc: Frank Rowand 
Cc: Mimi Zohar 
Reviewed-by: Mimi Zohar 
Reviewed-by: Rob Herring 
Tested-by: Nageswara R Sastry 
Tested-by: Coiby Xu 

---
v6:
 - Add __init to get_kexec_buffer as suggested by Jonathan

v5:
 - Rebased on Jonathan McDowell's commit "b69a2afd5afc x86/kexec: Carry
   forward IMA measurement log on kexec"
v4:
 - Move debug output into setup_buffer()
---
 drivers/of/kexec.c | 126 ++---
 1 file changed, 74 insertions(+), 52 deletions(-)

diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index 548dd5b1b5c1..15a82b574f36 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -117,45 +117,57 @@ static int do_get_kexec_buffer(const void *prop, int len, 
unsigned long *addr,
 }
 
 #ifdef CONFIG_HAVE_IMA_KEXEC
-/**
- * ima_get_kexec_buffer - get IMA buffer from the previous kernel
- * @addr:  On successful return, set to point to the buffer contents.
- * @size:  On successful return, set to the buffer size.
- *
- * Return: 0 on success, negative errno on error.
- */
-int __init ima_get_kexec_buffer(void **addr, size_t *size)
+static int __init get_kexec_buffer(const char *name, unsigned long *addr,
+  size_t *size)
 {
int ret, len;
-   unsigned long tmp_addr;
unsigned long start_pfn, end_pfn;
-   size_t tmp_size;
const void *prop;
 
-   prop = of_get_property(of_chosen, "linux,ima-kexec-buffer", );
+   prop = of_get_property(of_chosen, name, );
if (!prop)
return -ENOENT;
 
-   ret = do_get_kexec_buffer(prop, len, _addr, _size);
+   ret = do_get_kexec_buffer(prop, len, addr, size);
if (ret)
return ret;
 
-   /* Do some sanity on the returned size for the ima-kexec buffer */
-   if (!tmp_size)
+   /* Do some sanity on the returned size for the kexec buffer */
+   if (!*size)
return -ENOENT;
 
/*
 * Calculate the PFNs for the buffer and ensure
 * they are with in addressable memory.
 */
-   start_pfn = PHYS_PFN(tmp_addr);
-   end_pfn = PHYS_PFN(tmp_addr + tmp_size - 1);
+   start_pfn = PHYS_PFN(*addr);
+   end_pfn = PHYS_PFN(*addr + *size - 1);
if (!page_is_ram(start_pfn) || !page_is_ram(end_pfn)) {
-   pr_warn("IMA buffer at 0x%lx, size = 0x%zx beyond memory\n",
-   tmp_addr, tmp_size);
+   pr_warn("%s buffer at 0x%lx, size = 0x%zx beyond memory\n",
+   name, *addr, *size);
return -EINVAL;
}
 
+   return 0;
+}
+
+/**
+ * ima_get_kexec_buffer - get IMA buffer from the previous kernel
+ * @addr:  On successful return, set to point to the buffer contents.
+ * @size:  On successful return, set to the buffer size.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int __init ima_get_kexec_buffer(void **addr, size_t *size)
+{
+   int ret;
+   unsigned long tmp_addr;
+   size_t tmp_size;
+
+   ret = get_kexec_buffer("linux,ima-kexec-buffer", _addr, _size);
+   if (ret)
+   return ret;
+
*addr = __va(tmp_addr);
*size = tmp_size;
 
@@ -188,72 +200,82 @@ int __init ima_free_kexec_buffer(void)
 }
 #endif
 
-/**
- * remove_ima_buffer - remove the IMA buffer property and reservation from @fdt
- *
- * @fdt: Flattened Device Tree to update
- * @chosen_node: Offset to the chosen node in the device tree
- *
- * The IMA measurement buffer is of no use to a subsequent kernel, so we always
- * remove it from the device tree.
- */
-static void remove_ima_buffer(void *fdt, int chosen_node)
+static int remove_buffer(void *fdt, int chosen_node, const char *name)
 {
int ret, len;
unsigned long addr;
size_t size;
const void *prop;
 
-   if (!IS_ENABLED(CONFIG_HAVE_IMA_KEXEC))
-   return;
-
-   prop = fdt_getprop(fdt, chosen_node, "linux,ima-kexec-buffer", );
+   prop = fdt_getprop(fdt, chosen_node, name, );
if (!prop)
-   return;
+   return -ENOENT;
 
ret = do_get_kexec_buffer(prop, len, , );
-   fdt_delprop(fdt, chosen_node, "linux,ima-kexec-buffer");
+   fdt_delprop(fdt, chosen_node, name);
if (ret)
-   return;
+   return ret;
 
ret = fdt_find_and_del_mem_rsv(fdt, addr, size);
if (!ret)
-   pr_debug("Removed old IMA buffer reservation.\n");
+   pr_debug("Remove old %s buffer reserveration", name);
+   return ret;
 }
 
-#ifdef CONFIG_IMA_KEXEC
 /**
- * setup_ima_buffer - add IMA buffer information to the fdt
- * @image: kexec image being loaded.
- * @fdt:   Flattened device tree for the next kernel.
- * @chosen_node:   Offset to the chosen node.
+ * 

[PATCH v8 4/4] tpm/kexec: Duplicate TPM measurement log in of-tree for kexec

2022-09-01 Thread Stefan Berger
The memory area of the TPM measurement log is currently not properly
duplicated for carrying it across kexec when an Open Firmware
Devicetree is used. Therefore, the contents of the log get corrupted.
Fix this for the kexec_file_load() syscall by allocating a buffer and
copying the contents of the existing log into it. The new buffer is
preserved across the kexec and a pointer to it is available when the new
kernel is started. To achieve this, store the allocated buffer's address
in the flattened device tree (fdt) under the name linux,tpm-kexec-buffer
and search for this entry early in the kernel startup before the TPM
subsystem starts up. Adjust the pointer in the of-tree stored under
linux,sml-base to point to this buffer holding the preserved log. The TPM
driver can then read the base address from this entry when making the log
available. Invalidate the log by removing 'linux,sml-base' from the
devicetree if anything goes wrong with updating the buffer.

Use subsys_initcall() to call the function to restore the buffer even if
the TPM subsystem or driver are not used. This allows the buffer to be
carried across the next kexec without involvement of the TPM subsystem
and ensures a valid buffer pointed to by the of-tree.

Use the subsys_initcall(), rather than an ealier initcall, since
page_is_ram() in get_kexec_buffer() only starts working at this stage.

Signed-off-by: Stefan Berger 
Cc: Rob Herring 
Cc: Frank Rowand 
Cc: Eric Biederman 
Tested-by: Nageswara R Sastry 
Tested-by: Coiby Xu 
Reviewed-by: Rob Herring 

---
v6:
 - Define prototype for tpm_add_kexec_buffer under same config options
   as drivers/of/kexec.c is compiled, provide inline function otherwise.
   (kernel test robot)

v4:
 - Added #include  due to parisc
 - Use phys_addr_t for physical address rather than void *
 - Remove linux,sml-base if the buffer cannot be updated after a kexec
 - Added __init to functions where possible
---
 drivers/of/kexec.c| 216 +-
 include/linux/kexec.h |   6 ++
 include/linux/of.h|   9 +-
 kernel/kexec_file.c   |   6 ++
 4 files changed, 234 insertions(+), 3 deletions(-)

diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index 15a82b574f36..dd926e057215 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -19,6 +19,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #define RNG_SEED_SIZE  128
 
@@ -116,7 +118,6 @@ static int do_get_kexec_buffer(const void *prop, int len, 
unsigned long *addr,
return 0;
 }
 
-#ifdef CONFIG_HAVE_IMA_KEXEC
 static int __init get_kexec_buffer(const char *name, unsigned long *addr,
   size_t *size)
 {
@@ -151,6 +152,7 @@ static int __init get_kexec_buffer(const char *name, 
unsigned long *addr,
return 0;
 }
 
+#ifdef CONFIG_HAVE_IMA_KEXEC
 /**
  * ima_get_kexec_buffer - get IMA buffer from the previous kernel
  * @addr:  On successful return, set to point to the buffer contents.
@@ -239,7 +241,6 @@ static void remove_ima_buffer(void *fdt, int chosen_node)
remove_buffer(fdt, chosen_node, "linux,ima-kexec-buffer");
 }
 
-#ifdef CONFIG_IMA_KEXEC
 static int setup_buffer(void *fdt, int chosen_node, const char *name,
phys_addr_t addr, size_t size)
 {
@@ -263,6 +264,7 @@ static int setup_buffer(void *fdt, int chosen_node, const 
char *name,
 
 }
 
+#ifdef CONFIG_IMA_KEXEC
 /**
  * setup_ima_buffer - add IMA buffer information to the fdt
  * @image: kexec image being loaded.
@@ -285,6 +287,213 @@ static inline int setup_ima_buffer(const struct kimage 
*image, void *fdt,
 }
 #endif /* CONFIG_IMA_KEXEC */
 
+/**
+ * tpm_get_kexec_buffer - get TPM log buffer from the previous kernel
+ * @phyaddr:   On successful return, set to physical address of buffer
+ * @size:  On successful return, set to the buffer size.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+static int __init tpm_get_kexec_buffer(phys_addr_t *phyaddr, size_t *size)
+{
+   unsigned long tmp_addr;
+   size_t tmp_size;
+   int ret;
+
+   ret = get_kexec_buffer("linux,tpm-kexec-buffer", _addr, _size);
+   if (ret)
+   return ret;
+
+   *phyaddr = (phys_addr_t)tmp_addr;
+   *size = tmp_size;
+
+   return 0;
+}
+
+/**
+ * tpm_of_remove_kexec_buffer - remove the linux,tpm-kexec-buffer node
+ */
+static int __init tpm_of_remove_kexec_buffer(void)
+{
+   struct property *prop;
+
+   prop = of_find_property(of_chosen, "linux,tpm-kexec-buffer", NULL);
+   if (!prop)
+   return -ENOENT;
+
+   return of_remove_property(of_chosen, prop);
+}
+
+/**
+ * remove_tpm_buffer - remove the TPM log buffer property and reservation from 
@fdt
+ *
+ * @fdt: Flattened Device Tree to update
+ * @chosen_node: Offset to the chosen node in the device tree
+ *
+ * The TPM log measurement buffer is of no use to a subsequent kernel, so we 
always
+ * remove it from the device tree.
+ */
+static void 

[PATCH v8 0/4] tpm: Preserve TPM measurement log across kexec (ppc64)

2022-09-01 Thread Stefan Berger
The of-tree subsystem does not currently preserve the IBM vTPM 1.2 and
vTPM 2.0 measurement logs across a kexec on PowerVM and PowerKVM. This
series fixes this for the kexec_file_load() syscall using the flattened
device tree (fdt) to carry the TPM measurement log's buffer across kexec.

   Stefan

v8:
 - Added Jarkko's, Coiby's, and Rob's tags
 - Rebase on v6.0-rc3 that absorbed 2 already upstreamed patches

v7:
 - Added Nageswara's Tested-by tags
 - Added back original comment to inline function and removed Jarkko's R-b tag

v6:
 - Add __init to get_kexec_buffer as suggested by Jonathan
 - Fixed issue detected by kernel test robot

v5:
 - Rebased on 1 more patch that would otherwise create merge conflicts

v4:
 - Rebased on 2 patches that would otherwise create merge conflicts;
   posting these patches in this series with several tags removed so
   krobot can test the series already
 - Changes to individual patches documented in patch descripitons

v3:
 - Moved TPM Open Firmware related function to 
drivers/char/tpm/eventlog/tpm_of.c

v2:
 - rearranged patches
 - fixed compilation issues for x86

Palmer Dabbelt (1):
  drivers: of: kexec ima: Support 32-bit platforms

Stefan Berger (3):
  tpm: of: Make of-tree specific function commonly available
  of: kexec: Refactor IMA buffer related functions to make them reusable
  tpm/kexec: Duplicate TPM measurement log in of-tree for kexec

 drivers/char/tpm/eventlog/of.c |  31 +--
 drivers/of/kexec.c | 336 -
 include/linux/kexec.h  |   6 +
 include/linux/of.h |   9 +-
 include/linux/tpm.h|  36 
 kernel/kexec_file.c|   6 +
 6 files changed, 346 insertions(+), 78 deletions(-)


base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
-- 
2.35.1


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v3] arm64/kexec: Fix missing extra range for crashkres_low.

2022-09-01 Thread Will Deacon
On Wed, 31 Aug 2022 19:39:13 +0900, Levi Yun wrote:
> Like crashk_res, Calling crash_exclude_mem_range function with
> crashk_low_res area would need extra crash_mem range too.
> 
> Add one more extra cmem slot in case of crashk_low_res is used.
> 
> 

Applied to arm64 (for-next/fixes), thanks!

[1/1] arm64/kexec: Fix missing extra range for crashkres_low.
  https://git.kernel.org/arm64/c/4831be702b95

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/2] arm64, kdump: enforce to take 4G as the crashkernel low memory end

2022-09-01 Thread Baoquan He
On 09/01/22 at 10:24am, Mike Rapoport wrote:
> On Wed, Aug 31, 2022 at 10:29:39PM +0800, Baoquan He wrote:
> > On 08/31/22 at 10:37am, Mike Rapoport wrote:
> > > On Sun, Aug 28, 2022 at 08:55:44AM +0800, Baoquan He wrote:
> > > > 
> > > > Solution:
> > > > =
> > > > To fix the problem, we should always take 4G as the crashkernel low
> > > > memory end in case CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled.
> > > > With this, we don't need to defer the crashkernel reservation till
> > > > bootmem_init() is called to set the arm64_dma_phys_limit. As long as
> > > > memblock init is done, we can conclude what is the upper limit of low
> > > > memory zone.
> > > > 
> > > > 1) both CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 are disabled or 
> > > > memblock_start_of_DRAM() > 4G
> > > >   limit = PHYS_ADDR_MAX+1  (Corner cases)
> > > 
> > > Why these are corner cases? 
> > > The case when CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 are disabled is the
> > > simplest one because it does not require the whole dancing around
> > > arm64_dma_phys_limit initialization.
> > > 
> > > And AFAIK, memblock_start_of_DRAM() > 4G is not uncommon on arm64, but it
> > > does not matter for device DMA addressing.
> > 
> > Thanks for reviewing.
> > 
> > I could be wrong and have misunderstanding about corner case.
> > 
> > With my understanding, both ZONE_DMA and ZONE_DMA32 are enabled by
> > default in kernel. And on distros, I believe they are on too. The both
> > ZONE_DMA and ZONE_DMA32 disabled case should only exist on one specific
> > product, and the memblock_start_of_DRAM() > 4G case too. At least, I
> > haven't seen one in our LAB. What I thought the non generic as corner
> > case could be wrong. I will change that phrasing.
> > 
> > mm/Kconfig:
> > config ZONE_DMA
> > bool "Support DMA zone" if ARCH_HAS_ZONE_DMA_SET
> > default y if ARM64 || X86
> > 
> > config ZONE_DMA32
> > bool "Support DMA32 zone" if ARCH_HAS_ZONE_DMA_SET
> > depends on !X86_32
> > default y if ARM64
> 
> My point was that the cases with ZONE_DMA/DMA32 disabled or with RAM above
> 4G do not require detection of arm64_dma_phys_limit before reserving the
> crash kernel, can use predefined constants and are simple to handle.

I see.

>  
> > > The actual corner cases are systems with ZONE_DMA/DMA32 and with <32 bits
> > > limit for device DMA addressing (e.g RPi 4). I think the changelog should
> > 
> > Right, RPi4's 30bit DMA addressing device is corner case.
> > 
> > > mention that to use kdump on these devices user must specify
> > > crashkernel=X@Y 
> > 
> > Makes sense. I will add words in log, and add sentences to
> > mention that in code comment or some place of document.
> > Thanks for advice.
> > 
> > > 
> > > > 2) CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 are enabled:
> > > >limit = 4G  (generic case)
> > > > 
> 
> ...
> 
> > > > +static phys_addr_t __init crash_addr_low_max(void)
> > > > +{
> > > > +   phys_addr_t low_mem_mask = U32_MAX;
> > > > +   phys_addr_t phys_start = memblock_start_of_DRAM();
> > > > +
> > > > +   if ((!IS_ENABLED(CONFIG_ZONE_DMA) && 
> > > > !IS_ENABLED(CONFIG_ZONE_DMA32)) ||
> > > > +(phys_start > U32_MAX))
> > > > +   low_mem_mask = PHYS_ADDR_MAX;
> > > > +
> > > > +   return min(low_mem_mask, memblock_end_of_DRAM() - 1) + 1;
> > > 
> > > Since RAM frequently starts on non-zero address the limit for systems with
> > > ZONE_DMA/DMA32 should be memblock_start_of_DRAM() + 4G. There is no need 
> > > to
> > 
> > It may not be right for memblock_start_of_DRAM(). On most of arm64
> > servers I ever tested, their memblock usually starts from a higher
> > address, but not zero which is like x86. E.g below memory ranges printed
> > on an ampere-mtsnow-altra system, the starting addr is 0x8300. With
> > my understanding, DMA addressing bits correspond to the cpu logical
> > address range devices can address. So memblock_start_of_DRAM() + 4G
> > seems not right for normal system, and not right for system which
> > starting physical address is above 4G. I refer to max_zone_phys() of
> > arch/arm64/mm/init.c when implementing crash_addr_low_max(). Please
> > correct me if I am wrong.
> 
> My understanding was that no matter where DRAM starts, the first 4G would
> be accessible by 32-bit devices, but I maybe wrong as well :)
> 
> I haven't notice you used max_zone_phys() as a reference. Wouldn't it be
> simpler to just call it from crash_addr_low_max():
> 
> static phys_addr_t __init crash_addr_low_max(void)
> {
>   return max_zone_phys(32);
> }

max_zone_phys() only handles cases when CONFIG_ZONE_DMA/DMA32 enabled,
the disabledCONFIG_ZONE_DMA/DMA32 case is not included. I can change
it like:

static phys_addr_t __init crash_addr_low_max(void)
{
phys_addr_t low_mem_mask = U32_MAX;
phys_addr_t phys_start = memblock_start_of_DRAM();

if ((!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) ||
 

Re: [PATCH 1/2] arm64, kdump: enforce to take 4G as the crashkernel low memory end

2022-09-01 Thread Mike Rapoport
On Wed, Aug 31, 2022 at 10:29:39PM +0800, Baoquan He wrote:
> On 08/31/22 at 10:37am, Mike Rapoport wrote:
> > On Sun, Aug 28, 2022 at 08:55:44AM +0800, Baoquan He wrote:
> > > 
> > > Solution:
> > > =
> > > To fix the problem, we should always take 4G as the crashkernel low
> > > memory end in case CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled.
> > > With this, we don't need to defer the crashkernel reservation till
> > > bootmem_init() is called to set the arm64_dma_phys_limit. As long as
> > > memblock init is done, we can conclude what is the upper limit of low
> > > memory zone.
> > > 
> > > 1) both CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 are disabled or 
> > > memblock_start_of_DRAM() > 4G
> > >   limit = PHYS_ADDR_MAX+1  (Corner cases)
> > 
> > Why these are corner cases? 
> > The case when CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 are disabled is the
> > simplest one because it does not require the whole dancing around
> > arm64_dma_phys_limit initialization.
> > 
> > And AFAIK, memblock_start_of_DRAM() > 4G is not uncommon on arm64, but it
> > does not matter for device DMA addressing.
> 
> Thanks for reviewing.
> 
> I could be wrong and have misunderstanding about corner case.
> 
> With my understanding, both ZONE_DMA and ZONE_DMA32 are enabled by
> default in kernel. And on distros, I believe they are on too. The both
> ZONE_DMA and ZONE_DMA32 disabled case should only exist on one specific
> product, and the memblock_start_of_DRAM() > 4G case too. At least, I
> haven't seen one in our LAB. What I thought the non generic as corner
> case could be wrong. I will change that phrasing.
> 
> mm/Kconfig:
> config ZONE_DMA
> bool "Support DMA zone" if ARCH_HAS_ZONE_DMA_SET
> default y if ARM64 || X86
> 
> config ZONE_DMA32
> bool "Support DMA32 zone" if ARCH_HAS_ZONE_DMA_SET
> depends on !X86_32
> default y if ARM64

My point was that the cases with ZONE_DMA/DMA32 disabled or with RAM above
4G do not require detection of arm64_dma_phys_limit before reserving the
crash kernel, can use predefined constants and are simple to handle.
 
> > The actual corner cases are systems with ZONE_DMA/DMA32 and with <32 bits
> > limit for device DMA addressing (e.g RPi 4). I think the changelog should
> 
> Right, RPi4's 30bit DMA addressing device is corner case.
> 
> > mention that to use kdump on these devices user must specify
> > crashkernel=X@Y 
> 
> Makes sense. I will add words in log, and add sentences to
> mention that in code comment or some place of document.
> Thanks for advice.
> 
> > 
> > > 2) CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 are enabled:
> > >limit = 4G  (generic case)
> > > 

...

> > > +static phys_addr_t __init crash_addr_low_max(void)
> > > +{
> > > + phys_addr_t low_mem_mask = U32_MAX;
> > > + phys_addr_t phys_start = memblock_start_of_DRAM();
> > > +
> > > + if ((!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) ||
> > > +  (phys_start > U32_MAX))
> > > + low_mem_mask = PHYS_ADDR_MAX;
> > > +
> > > + return min(low_mem_mask, memblock_end_of_DRAM() - 1) + 1;
> > 
> > Since RAM frequently starts on non-zero address the limit for systems with
> > ZONE_DMA/DMA32 should be memblock_start_of_DRAM() + 4G. There is no need to
> 
> It may not be right for memblock_start_of_DRAM(). On most of arm64
> servers I ever tested, their memblock usually starts from a higher
> address, but not zero which is like x86. E.g below memory ranges printed
> on an ampere-mtsnow-altra system, the starting addr is 0x8300. With
> my understanding, DMA addressing bits correspond to the cpu logical
> address range devices can address. So memblock_start_of_DRAM() + 4G
> seems not right for normal system, and not right for system which
> starting physical address is above 4G. I refer to max_zone_phys() of
> arch/arm64/mm/init.c when implementing crash_addr_low_max(). Please
> correct me if I am wrong.

My understanding was that no matter where DRAM starts, the first 4G would
be accessible by 32-bit devices, but I maybe wrong as well :)

I haven't notice you used max_zone_phys() as a reference. Wouldn't it be
simpler to just call it from crash_addr_low_max():

static phys_addr_t __init crash_addr_low_max(void)
{
return max_zone_phys(32);
}
 
-- 
Sincerely yours,
Mike.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec