Re: [PATCH 2/2] support kdump when AMD secure memory encryption is active

2018-05-15 Thread Tom Lendacky
On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
> When sme enabled on AMD server, we also need to support kdump. Because
> the memory is encrypted in the first kernel, we will remap the old memory
> encrypted to the second kernel(crash kernel), and sme is also enabled in
> the second kernel, otherwise the old memory encrypted can not be decrypted.
> Because simply changing the value of a C-bit on a page will not
> automatically encrypt the existing contents of a page, and any data in the
> page prior to the C-bit modification will become unintelligible. A page of
> memory that is marked encrypted will be automatically decrypted when read
> from DRAM and will be automatically encrypted when written to DRAM.
> 
> For the kdump, it is necessary to distinguish whether the memory is
> encrypted. Furthermore, we should also know which part of the memory is
> encrypted or decrypted. We will appropriately remap the memory according
> to the specific situation in order to tell cpu how to deal with the data(
> encrypted or unencrypted). For example, when sme enabled, if the old memory
> is encrypted, we will remap the old memory in encrypted way, which will
> automatically decrypt the old memory encrypted when we read those data from
> the remapping address.
> 
>  --
> | first-kernel | second-kernel | kdump support |
> |  (mem_encrypt=on|off)|   (yes|no)|
> |--+---+---|
> | on   | on| yes   |
> | off  | off   | yes   |
> | on   | off   | no|
> | off  | on| no|
> |__|___|___|
> 
> Signed-off-by: Lianbo Jiang 
> ---
>  arch/x86/include/asm/dmi.h  | 14 +-
>  arch/x86/kernel/acpi/boot.c |  8 
>  arch/x86/kernel/crash_dump_64.c | 27 +++
>  drivers/acpi/tables.c   | 14 +-
>  drivers/iommu/amd_iommu_init.c  |  9 -
>  fs/proc/vmcore.c| 36 +++-
>  include/linux/crash_dump.h  |  4 
>  kernel/kexec_core.c | 12 
>  8 files changed, 116 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/dmi.h b/arch/x86/include/asm/dmi.h
> index 0ab2ab2..a5663b4 100644
> --- a/arch/x86/include/asm/dmi.h
> +++ b/arch/x86/include/asm/dmi.h
> @@ -7,6 +7,10 @@
>  
>  #include 
>  #include 
> +#ifdef CONFIG_AMD_MEM_ENCRYPT

I don't think you need all of the #ifdef stuff throughout this
patch.  Everything should work just fine without it.

> +#include 
> +#include 
> +#endif
>  
>  static __always_inline __init void *dmi_alloc(unsigned len)
>  {
> @@ -14,7 +18,15 @@ static __always_inline __init void *dmi_alloc(unsigned len)
>  }
>  
>  /* Use early IO mappings for DMI because it's initialized early */
> -#define dmi_early_remap  early_memremap
> +static __always_inline __init void *dmi_early_remap(resource_size_t
> + phys_addr, unsigned long size)
> +{
> +#ifdef CONFIG_AMD_MEM_ENCRYPT

Again, no need for the #ifdef here.  You should probably audit the
code for all of these and truly determine if they are really needed.

> + if (sme_active() && is_kdump_kernel())

Use of sme_active() here is good since under SEV, this area will be
encrypted.

> + return early_memremap_decrypted(phys_addr, size);
> +#endif
> + return early_memremap(phys_addr, size);

Instead of doing this, maybe it makes more sense to put this logic
somewhere in the early_memremap() path.  Possibly smarten up the
early_memremap_pgprot_adjust() function with some kdump kernel
related logic.  Not sure it's possible, but would be nice since you
have this logic in a couple of places.

> +}
>  #define dmi_early_unmap  early_memunmap
>  #define dmi_remap(_x, _l)memremap(_x, _l, MEMREMAP_WB)
>  #define dmi_unmap(_x)memunmap(_x)
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index 3b20607..354ad66 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -48,6 +48,10 @@
>  #include 
>  #include 
>  #include 
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +#include 
> +#include 
> +#endif
>  
>  #include "sleep.h" /* To include x86_acpi_suspend_lowlevel */
>  static int __initdata acpi_force = 0;
> @@ -124,6 +128,10 @@ void __init __iomem *__acpi_map_table(unsigned long 
> phys, unsigned long size)
>   if (!phys || !size)
>   return NULL;
>  
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + if (sme_active() && is_kdump_kernel())
> + return early_memremap_decrypted(phys, size);
> +#endif

Same as previous comment(s).

>   return early_memremap(phys, size);
>  }
>  
> diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
> index 4f2e077..2ef67fc 100644
> --- a/arch/x86/kernel/crash_dump_64.c
> +++ b/arch/

Re: [PATCH v9 06/11] arm64: kexec_file: allow for loading Image-format kernel

2018-05-15 Thread James Morse
Hi Akashi,

On 15/05/18 06:13, AKASHI Takahiro wrote:
> On Fri, May 11, 2018 at 06:07:06PM +0100, James Morse wrote:
>> On 07/05/18 08:21, AKASHI Takahiro wrote:
>>> On Tue, May 01, 2018 at 06:46:11PM +0100, James Morse wrote:
 On 25/04/18 07:26, AKASHI Takahiro wrote:
> This patch provides kexec_file_ops for "Image"-format kernel. In this
> implementation, a binary is always loaded with a fixed offset identified
> in text_offset field of its header.
>>
> diff --git a/arch/arm64/include/asm/kexec.h 
> b/arch/arm64/include/asm/kexec.h
> index e4de1223715f..3cba4161818a 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h

 Could we check branch_code is non-zero, and text-offset points within 
 image-size?
>>>
>>> We could do it, but I don't think this check is very useful.
>>>

 We could check that this platform supports the page-size/endian config 
 that this
 Image was built with... We get a message from the EFI stub if the page-size
 can't be supported, it would be nice to do the same here (as we can).
>>>
>>> There is no restriction on page-size or endianness for kexec.
>>
>> No, but it won't boot if the hardware doesn't support it. The kernel will 
>> spin
>> at a magic address that is, difficult, to debug without JTAG. The bug report
>> will be "it didn't boot".
> 
> OK.
> Added sanity checks for cpu features, endianness as well as page size.
> 
>>
>>> What will be the purpose of this check?
>>
>> These values are in the header so that the bootloader can check them, then 
>> print
>> a meaningful error. Here, kexec_file_load() is playing the part of the 
>> bootloader.

>> I'm assuming kexec_file_load() can only be used to kexec linux... unlike 
>> regular
>> kexec. Is this where I'm going wrong?

Trying to work this out for myself: we can't support any UEFI application as we
can't give it the boot-services environment, so I'm pretty sure
kexec_file_load() must be linux-specific.

Can we state somewhere that we only expect arm64 linux to be booted with
kexec_file_load()? Its not clear from the kconfig text, which refers to kexec,
which explicitly states it can boot other OS. But for kexec_file_load() we're
following the kernel's booting.txt.


> diff --git a/arch/arm64/kernel/kexec_image.c 
> b/arch/arm64/kernel/kexec_image.c
> new file mode 100644
> index ..4dd524ad6611
> --- /dev/null
> +++ b/arch/arm64/kernel/kexec_image.c
> @@ -0,0 +1,79 @@

> +static void *image_load(struct kimage *image,
> + char *kernel, unsigned long kernel_len,
> + char *initrd, unsigned long initrd_len,
> + char *cmdline, unsigned long cmdline_len)
> +{
> + struct kexec_buf kbuf;
> + struct arm64_image_header *h = (struct arm64_image_header *)kernel;
> + unsigned long text_offset;
> + int ret;
> +
> + /* Load the kernel */
> + kbuf.image = image;
> + kbuf.buf_min = 0;
> + kbuf.buf_max = ULONG_MAX;
> + kbuf.top_down = false;
> +
> + kbuf.buffer = kernel;
> + kbuf.bufsz = kernel_len;
> + kbuf.memsz = le64_to_cpu(h->image_size);
> + text_offset = le64_to_cpu(h->text_offset);
> + kbuf.buf_align = SZ_2M;

> + /* Adjust kernel segment with TEXT_OFFSET */
> + kbuf.memsz += text_offset;
> +
> + ret = kexec_add_buffer(&kbuf);
> + if (ret)
> + goto out;
> +
> + image->arch.kern_segment = image->nr_segments - 1;

 You only seem to use kern_segment here, and in load_other_segments() called
 below. Could it not be a local variable passed in? Instead of 
 arch-specific data
 we keep forever?
>>>
>>> No, kern_segment is also used in load_other_segments() in 
>>> machine_kexec_file.c.
>>> To optimize memory hole allocation logic in locate_mem_hole_callback(),
>>> we need to know the exact range of kernel image (start and end).
>>
>> That's the second user. My badly-made point is one calls the other, but 
>> passes
>> the data via some until-kexec lifetime struct. (its not important, just an
>> indicator this worked differently in the past and hasn't been cleaned up).
>> I meant something like [0].
> 
> OK, but instead of adding kern_seg, I want to change the interface to:
> 
> | extern int load_other_segments(struct kimage *image,
> | unsigned long kernel_load_addr, unsigned long kernel_size,
> | char *initrd, unsigned long initrd_len,
> | char *cmdline, unsigned long cmdline_len);
> 
> This way, we will in future be able to address an issue I mentioned in
> my previous e-mail. (If we support vmlinux, the kernel occupies two segments
> for text and data, respectively.)

Aha, its not from old-stuff, its for future-stuff!


James

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailm

Re: [PATCH v9 07/11] arm64: kexec_file: add crash dump support

2018-05-15 Thread James Morse
Hi guys,

(CC: +RobH, devicetree list)

On 25/04/18 07:26, AKASHI Takahiro wrote:
> Enabling crash dump (kdump) includes
> * prepare contents of ELF header of a core dump file, /proc/vmcore,
>   using crash_prepare_elf64_headers(), and
> * add two device tree properties, "linux,usable-memory-range" and
>   "linux,elfcorehdr", which represent repsectively a memory range
>   to be used by crash dump kernel and the header's location

kexec_file_load() on arm64 needs to be able to create a prop encoded array to
the FDT, but there doesn't appear to be a libfdt helper to do this.

Akashi's code below adds fdt_setprop_range() to the arch code, and duplicates
bits of libfdt_internal.h to do the work.

How should this be done? I'm assuming this is something we need a new API in
libfdt.h for. How do these come about, and is there an interim step we can use
until then?

Thanks!

James

> diff --git a/arch/arm64/kernel/machine_kexec_file.c 
> b/arch/arm64/kernel/machine_kexec_file.c
> index 37c0a9dc2e47..ec674f4d267c 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -76,6 +81,78 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf,
>   return ret;
>  }
>  
> +static int __init arch_kexec_file_init(void)
> +{
> + /* Those values are used later on loading the kernel */
> + __dt_root_addr_cells = dt_root_addr_cells;
> + __dt_root_size_cells = dt_root_size_cells;
> +
> + return 0;
> +}
> +late_initcall(arch_kexec_file_init);
> +
> +#define FDT_ALIGN(x, a)  (((x) + (a) - 1) & ~((a) - 1))
> +#define FDT_TAGALIGN(x)  (FDT_ALIGN((x), FDT_TAGSIZE))
> +
> +static int fdt_prop_len(const char *prop_name, int len)
> +{
> + return (strlen(prop_name) + 1) +
> + sizeof(struct fdt_property) +
> + FDT_TAGALIGN(len);
> +}
> +
> +static bool cells_size_fitted(unsigned long base, unsigned long size)
> +{
> + /* if *_cells >= 2, cells can hold 64-bit values anyway */
> + if ((__dt_root_addr_cells == 1) && (base >= (1ULL << 32)))
> + return false;
> +
> + if ((__dt_root_size_cells == 1) && (size >= (1ULL << 32)))
> + return false;
> +
> + return true;
> +}
> +
> +static void fill_property(void *buf, u64 val64, int cells)
> +{
> + u32 val32;
> +
> + if (cells == 1) {
> + val32 = cpu_to_fdt32((u32)val64);
> + memcpy(buf, &val32, sizeof(val32));
> + } else {
> + memset(buf, 0, cells * sizeof(u32) - sizeof(u64));
> + buf += cells * sizeof(u32) - sizeof(u64);
> +
> + val64 = cpu_to_fdt64(val64);
> + memcpy(buf, &val64, sizeof(val64));
> + }
> +}
> +
> +static int fdt_setprop_range(void *fdt, int nodeoffset, const char *name,
> + unsigned long addr, unsigned long size)
> +{
> + void *buf, *prop;
> + size_t buf_size;
> + int result;
> +
> + buf_size = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
> + prop = buf = vmalloc(buf_size);
> + if (!buf)
> + return -ENOMEM;
> +
> + fill_property(prop, addr, __dt_root_addr_cells);
> + prop += __dt_root_addr_cells * sizeof(u32);
> +
> + fill_property(prop, size, __dt_root_size_cells);
> +
> + result = fdt_setprop(fdt, nodeoffset, name, buf, buf_size);
> +
> + vfree(buf);
> +
> + return result;
> +}
> +
>  static int setup_dtb(struct kimage *image,
>   unsigned long initrd_load_addr, unsigned long initrd_len,
>   char *cmdline, unsigned long cmdline_len,
> @@ -88,10 +165,26 @@ static int setup_dtb(struct kimage *image,
>   int range_len;
>   int ret;
>  
> + /* check ranges against root's #address-cells and #size-cells */
> + if (image->type == KEXEC_TYPE_CRASH &&
> + (!cells_size_fitted(image->arch.elf_load_addr,
> + image->arch.elf_headers_sz) ||
> +  !cells_size_fitted(crashk_res.start,
> + crashk_res.end - crashk_res.start + 1))) {
> + pr_err("Crash memory region doesn't fit into DT's root cell 
> sizes.\n");
> + ret = -EINVAL;
> + goto out_err;
> + }
> +
>   /* duplicate dt blob */
>   buf_size = fdt_totalsize(initial_boot_params);
>   range_len = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
>  
> + if (image->type == KEXEC_TYPE_CRASH)
> + buf_size += fdt_prop_len("linux,elfcorehdr", range_len)
> + + fdt_prop_len("linux,usable-memory-range",
> + range_len);
> +
>   if (initrd_load_addr)
>   buf_size += fdt_prop_len("linux,initrd-start", sizeof(u64))
>   + fdt_prop_len("linux,initrd-end", sizeof(u64));
> @@ -113,6 +206,23 @@ static int setup_dtb(struct kimage *image,
>   if (nodeoffset < 0)
>   goto out_err;
>  
> + if (im

Re: [PATCH v9 07/11] arm64: kexec_file: add crash dump support

2018-05-15 Thread James Morse
Hi Akashi,

On 25/04/18 07:26, AKASHI Takahiro wrote:
> Enabling crash dump (kdump) includes
> * prepare contents of ELF header of a core dump file, /proc/vmcore,
>   using crash_prepare_elf64_headers(), and
> * add two device tree properties, "linux,usable-memory-range" and
>   "linux,elfcorehdr", which represent repsectively a memory range

(Nit: respectively)


>   to be used by crash dump kernel and the header's location

>  arch/arm64/include/asm/kexec.h |   4 +
>  arch/arm64/kernel/kexec_image.c|   9 +-
>  arch/arm64/kernel/machine_kexec_file.c | 202 +

In this patch, machine_kexec_file.c gains its own private fdt array encoder.


> diff --git a/arch/arm64/kernel/machine_kexec_file.c 
> b/arch/arm64/kernel/machine_kexec_file.c
> index 37c0a9dc2e47..ec674f4d267c 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -76,6 +81,78 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf,
>   return ret;
>  }
>  
> +static int __init arch_kexec_file_init(void)
> +{
> + /* Those values are used later on loading the kernel */
> + __dt_root_addr_cells = dt_root_addr_cells;
> + __dt_root_size_cells = dt_root_size_cells;
> +
> + return 0;
> +}
> +late_initcall(arch_kexec_file_init);

If we need these is it worth taking them out of __initdata? I note they've been
'temporary' for quite a long time.


> +
> +#define FDT_ALIGN(x, a)  (((x) + (a) - 1) & ~((a) - 1))
> +#define FDT_TAGALIGN(x)  (FDT_ALIGN((x), FDT_TAGSIZE))
> +
> +static int fdt_prop_len(const char *prop_name, int len)
> +{
> + return (strlen(prop_name) + 1) +
> + sizeof(struct fdt_property) +
> + FDT_TAGALIGN(len);
> +}

This stuff should really be in libfdt.h  Those macros come from
libfdt_internal.h, so we're probably doing something wrong here.


> +static bool cells_size_fitted(unsigned long base, unsigned long size)
> +{
> + /* if *_cells >= 2, cells can hold 64-bit values anyway */
> + if ((__dt_root_addr_cells == 1) && (base >= (1ULL << 32)))
> + return false;
> +
> + if ((__dt_root_size_cells == 1) && (size >= (1ULL << 32)))
> + return false;

Using '> U32_MAX' here may be more readable.


> + return true;
> +}
> +
> +static void fill_property(void *buf, u64 val64, int cells)
> +{
> + u32 val32;
> +
> + if (cells == 1) {
> + val32 = cpu_to_fdt32((u32)val64);
> + memcpy(buf, &val32, sizeof(val32));
> + } else {

> + memset(buf, 0, cells * sizeof(u32) - sizeof(u64));
> + buf += cells * sizeof(u32) - sizeof(u64);

Is this trying to clear the 'top' cells and shuffle the pointer to point at the
'bottom' 2? I'm pretty sure this isn't endian safe.

Do we really expect a system to have #address-cells > 2?


> + val64 = cpu_to_fdt64(val64);
> + memcpy(buf, &val64, sizeof(val64));
> + }
> +}
> +
> +static int fdt_setprop_range(void *fdt, int nodeoffset, const char *name,
> + unsigned long addr, unsigned long size)

(the device-tree spec describes a 'ranges' property, which had me confused. This
is encoding a prop-encoded-array)

> +{
> + void *buf, *prop;
> + size_t buf_size;
> + int result;
> +
> + buf_size = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);
> + prop = buf = vmalloc(buf_size);

virtual memory allocation for something less than PAGE_SIZE?


> + if (!buf)
> + return -ENOMEM;
> +
> + fill_property(prop, addr, __dt_root_addr_cells);
> + prop += __dt_root_addr_cells * sizeof(u32);
> +
> + fill_property(prop, size, __dt_root_size_cells);
> +
> + result = fdt_setprop(fdt, nodeoffset, name, buf, buf_size);
> +
> + vfree(buf);
> +
> + return result;
> +}

Doesn't this stuff belong in libfdt? I guess there is no 'add array element' api
because this the first time we've wanted to create a node with more than
key=fixed-size-value.

I don't think this belongs in arch C code. Do we have a plan for getting libfdt
to support encoding prop-arrays? Can we put it somewhere anyone else duplicating
this will find it, until we can (re)move it?

I have no idea how that happens... it looks like the devicetree list is the
place to ask.


>  static int setup_dtb(struct kimage *image,
>   unsigned long initrd_load_addr, unsigned long initrd_len,
>   char *cmdline, unsigned long cmdline_len,
> @@ -88,10 +165,26 @@ static int setup_dtb(struct kimage *image,
>   int range_len;
>   int ret;
>  
> + /* check ranges against root's #address-cells and #size-cells */
> + if (image->type == KEXEC_TYPE_CRASH &&
> + (!cells_size_fitted(image->arch.elf_load_addr,
> + image->arch.elf_headers_sz) ||
> +  !cells_size_fitted(crashk_res.start,
> + crashk_res.end - crashk_res.start + 1))) {
> +  

Re: [PATCH v9 05/11] arm64: kexec_file: load initrd and device-tree

2018-05-15 Thread James Morse
Hi Akashi,

On 25/04/18 07:26, AKASHI Takahiro wrote:
> load_other_segments() is expected to allocate and place all the necessary
> memory segments other than kernel, including initrd and device-tree
> blob (and elf core header for crash).
> While most of the code was borrowed from kexec-tools' counterpart,
> users may not be allowed to specify dtb explicitly, instead, the dtb
> presented by a boot loader is reused.

(Nit: "a boot loader" -> "the original boot loader")

> arch_kimage_kernel_post_load_cleanup() is responsible for freeing arm64-
> specific data allocated in load_other_segments().


> diff --git a/arch/arm64/kernel/machine_kexec_file.c 
> b/arch/arm64/kernel/machine_kexec_file.c
> index f9ebf54ca247..b3b9b1725d8a 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -13,7 +13,26 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> +#include 
> +#include 
> +#include 
> +
> +static int __dt_root_addr_cells;
> +static int __dt_root_size_cells;

> @@ -55,3 +74,144 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf,
>  
>   return ret;
>  }
> +
> +static int setup_dtb(struct kimage *image,
> + unsigned long initrd_load_addr, unsigned long initrd_len,
> + char *cmdline, unsigned long cmdline_len,
> + char **dtb_buf, size_t *dtb_buf_len)
> +{
> + char *buf = NULL;
> + size_t buf_size;
> + int nodeoffset;
> + u64 value;
> + int range_len;
> + int ret;
> +
> + /* duplicate dt blob */
> + buf_size = fdt_totalsize(initial_boot_params);
> + range_len = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32);

These two cells values are 0 here. Did you want
arch_kexec_file_init() in patch 7 in this patch?

Ah, range_len isn't used, so, did you want the cells values and this range_len
thing in in patch 7!?


> +
> + if (initrd_load_addr)
> + buf_size += fdt_prop_len("linux,initrd-start", sizeof(u64))
> + + fdt_prop_len("linux,initrd-end", sizeof(u64));
> +
> + if (cmdline)
> + buf_size += fdt_prop_len("bootargs", cmdline_len + 1);

I can't find where fdt_prop_len()  oh, patch 7. fdt_prop_len() doesn't look
like the sort of thing that should be created here, but I agree there isn't an
existing API to do this.

(This must be why powerpc guesses that the fdt won't be more than double in 
size).


> + buf = vmalloc(buf_size);
> + if (!buf) {
> + ret = -ENOMEM;
> + goto out_err;
> + }
> +
> + ret = fdt_open_into(initial_boot_params, buf, buf_size);
> + if (ret)
> + goto out_err;
> +
> + nodeoffset = fdt_path_offset(buf, "/chosen");
> + if (nodeoffset < 0)
> + goto out_err;
> +
> + /* add bootargs */
> + if (cmdline) {
> + ret = fdt_setprop(buf, nodeoffset, "bootargs",
> + cmdline, cmdline_len + 1);

fdt_setprop_string()?


> + if (ret)
> + goto out_err;
> + }
> +
> + /* add initrd-* */
> + if (initrd_load_addr) {
> + value = cpu_to_fdt64(initrd_load_addr);
> + ret = fdt_setprop(buf, nodeoffset, "linux,initrd-start",
> + &value, sizeof(value));

sizeof(value) was assumed to be the same as sizeof(u64) earlier.
fdt_setprop_u64()?


> + if (ret)
> + goto out_err;
> +
> + value = cpu_to_fdt64(initrd_load_addr + initrd_len);
> + ret = fdt_setprop(buf, nodeoffset, "linux,initrd-end",
> + &value, sizeof(value));
> + if (ret)
> + goto out_err;
> + }
> +
> + /* trim a buffer */
> + fdt_pack(buf);
> + *dtb_buf = buf;
> + *dtb_buf_len = fdt_totalsize(buf);
> +
> + return 0;
> +
> +out_err:
> + vfree(buf);
> + return ret;
> +}

While powerpc has some similar code for updating the initrd and cmdline, it
makes different assumptions about the size of the dt, and has different behavior
for memreserve. (looks like we don't expect the initramfs to be memreserved).
Lets leave unifying that stuff where possible for the future.


> +int load_other_segments(struct kimage *image,
> + char *initrd, unsigned long initrd_len,
> + char *cmdline, unsigned long cmdline_len)
> +{
> + struct kexec_segment *kern_seg;
> + struct kexec_buf kbuf;
> + unsigned long initrd_load_addr = 0;
> + char *dtb = NULL;
> + unsigned long dtb_len = 0;
> + int ret = 0;
> +
> + kern_seg = &image->segment[image->arch.kern_segment];
> + kbuf.image = image;
> + /* not allocate anything below the kernel */
> + kbuf.buf_min = kern_seg->mem + kern_seg->memsz;

> + /* load initrd */
> + if (initrd) {
> + kbuf.buffer = initrd;
> + kbuf.bufsz = initrd_len;
> + kbu

Re: [PATCH v9 04/11] arm64: kexec_file: allocate memory walking through memblock list

2018-05-15 Thread James Morse
Hi Akashi,

On 15/05/18 05:35, AKASHI Takahiro wrote:
> On Mon, May 07, 2018 at 02:59:07PM +0900, AKASHI Takahiro wrote:
>> On Tue, May 01, 2018 at 06:46:09PM +0100, James Morse wrote:
>>> On 25/04/18 07:26, AKASHI Takahiro wrote:
 We need to prevent firmware-reserved memory regions, particularly EFI
 memory map as well as ACPI tables, from being corrupted by loading
 kernel/initrd (or other kexec buffers). We also want to support memory
 allocation in top-down manner in addition to default bottom-up.
 So let's have arm64 specific arch_kexec_walk_mem() which will search
 for available memory ranges in usable memblock list,
 i.e. !NOMAP & !reserved, 
>>>
 instead of system resource tree.
>>>
>>> Didn't we try to fix the system-resource-tree in order to fix regular-kexec 
>>> to
>>> be safe in the EFI-memory-map/ACPI-tables case?
>>>
>>> It would be good to avoid having two ways of doing this, and I would like to
>>> avoid having extra arch code...
>>
>> I know what you mean.
>> /proc/iomem or system resource is, in my opinion, not the best place to
>> describe memory usage of kernel but rather to describe *physical* hardware
>> layout. As we are still discussing about "reserved" memory, I don't want
>> to depend on it.

I agree. We have funny stuff that isn't hardware-layout, but is important for
the next boot. The kernel doesn't have an ABI to support when it queries the
list itself.


>> Along with memblock list, we will have more accurate control over memory
>> usage.

>>> If the argument is walking memblock gives a better answer than the stringy
>>> walk_system_ram_res() thing, is there any mileage in moving this code into
>>> kexec_file.c, and using it if !IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK)?
>>>
>>> This would save arm64/powerpc having near-identical implementations.
>>> 32bit arm keeps memblock if it has kexec, so it may be useful there too if
>>> kexec_file_load() support is added.

> If you don't have further objection, I will take memblock approach
> (with factoring out powerpc's arch_kexec_walk_mem()).

If we're agreed that the memblock walking is generic, then it would be quicker
to make the arm64 version as close as possible and merge them as a later series.
(saves a cross arch dependency)

With that,
Reviewed-by: James Morse 


Thanks,

James

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v9 03/11] arm64: kexec_file: invoke the kernel without purgatory

2018-05-15 Thread James Morse
Hi Akashi,

On 15/05/18 05:45, AKASHI Takahiro wrote:
> On Fri, May 11, 2018 at 06:03:49PM +0100, James Morse wrote:
>> On 07/05/18 06:22, AKASHI Takahiro wrote:
>>> On Tue, May 01, 2018 at 06:46:06PM +0100, James Morse wrote:
 On 25/04/18 07:26, AKASHI Takahiro wrote:
> diff --git a/arch/arm64/kernel/machine_kexec.c 
> b/arch/arm64/kernel/machine_kexec.c
> index f76ea92dff91..f7dbba00be10 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -205,10 +205,17 @@ void machine_kexec(struct kimage *kimage)
>>
>   cpu_soft_restart(kimage != kexec_crash_image,
> - reboot_code_buffer_phys, kimage->head, kimage->start, 0);
> + reboot_code_buffer_phys, kimage->head, kimage->start,
> +#ifdef CONFIG_KEXEC_FILE
> + kimage->purgatory_info.purgatory_buf ?
> + 0 : kimage->arch.dtb_mem);
> +#else
> + 0);
> +#endif
>>
>>
 purgatory_buf seems to only be set in kexec_purgatory_setup_kbuf(), called 
 from
 kexec_load_purgatory(), which we don't use. How does this get a value?

 Would it be better to always use kimage->arch.dtb_mem, and ensure that is 
 0 for
 regular kexec (as we can't know where the dtb is)? (image_arg may then be a
 better name).
>>>
>>> The problem is arch.dtb_mem is currently defined only if CONFIG_KEXEC_FILE.
>>
>> I thought it was ARCH_HAS_KIMAGE_ARCH, which we can define all the time if
>> that's what we want.
>>
>>
>>> So I would like to
>>> - merge this patch with patch#8
>>> - change the condition
>>> #ifdef CONFIG_KEXEC_FILE
>>> kimage->file_mode ? 
>>> kimage->arch.dtb_mem : 0);
>>> #else
>>> 0);
>>> #endif
>>
>> If we can avoid even this #ifdef by always having kimage->arch, I'd prefer 
>> that.
>> If we do that 'dtb_mem' would need some thing that indicates its for 
>> kexec_file,
>> as kexec has a DTB too, we just don't know where it is...
> 
> OK, but I want to have a minimum of kexec.arch always exist.

I'm curious, why? Its 32bytes that is allocated a maximum of twice.

(my questions on what needs to go in there were because it looked like a third
user was missing...)


> How about this?
>
> | struct kimage_arch {
> | phys_addr_t dtb_mem;
> | #ifdef CONFIG_KEXEC_FILE

#ifdef in structs just breeds more #ifdefs, as the code that accesses those
members has to be behind the same set of conditions.

Given this, I prefer the #ifdefs around cpu_soft_restart() as it doesn't force
us to add more #ifdefs later.

For either option without purgatory_info:
Reviewed-by: James Morse 


Thanks,

James

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 1/2] add a function(ioremap_encrypted) for kdump when AMD sme enabled.

2018-05-15 Thread Tom Lendacky
On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
> It is convenient to remap the old memory encrypted to the second kernel
> by calling ioremap_encrypted().
> 
> Signed-off-by: Lianbo Jiang 
> ---
>  arch/x86/include/asm/io.h |  2 ++
>  arch/x86/mm/ioremap.c | 25 +
>  2 files changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index f6e5b93..06d2a9f 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -192,6 +192,8 @@ extern void __iomem *ioremap_cache(resource_size_t 
> offset, unsigned long size);
>  #define ioremap_cache ioremap_cache
>  extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long 
> size, unsigned long prot_val);
>  #define ioremap_prot ioremap_prot
> +extern void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned 
> long size);
> +#define ioremap_encrypted ioremap_encrypted
>  
>  /**
>   * ioremap -   map bus memory into CPU space
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index c63a545..7a52d1e 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -131,7 +131,8 @@ static void __ioremap_check_mem(resource_size_t addr, 
> unsigned long size,
>   * caller shouldn't need to know that small detail.
>   */
>  static void __iomem *__ioremap_caller(resource_size_t phys_addr,
> - unsigned long size, enum page_cache_mode pcm, void *caller)
> + unsigned long size, enum page_cache_mode pcm,
> + void *caller, bool encrypted)
>  {
>   unsigned long offset, vaddr;
>   resource_size_t last_addr;
> @@ -199,7 +200,8 @@ static void __iomem *__ioremap_caller(resource_size_t 
> phys_addr,
>* resulting mapping.
>*/
>   prot = PAGE_KERNEL_IO;
> - if (sev_active() && mem_flags.desc_other)
> + if ((sev_active() && mem_flags.desc_other) ||
> + (encrypted && sme_active()))

You only need the encrypted check here.  If sme is not active,
then the pgprot_encrypted will basically be a no-op.

Also, extra indents.

Thanks,
Tom

>   prot = pgprot_encrypted(prot);
>  
>   switch (pcm) {
> @@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, 
> unsigned long size)
>   enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
>  
>   return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_nocache);
>  
> @@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, 
> unsigned long size)
>   enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
>  
>   return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL_GPL(ioremap_uc);
>  
> @@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
>  void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
>  {
>   return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_wc);
>  
> @@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
>  void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
>  {
>   return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_wt);
>  
> +void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long 
> size)
> +{
> + return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> + __builtin_return_address(0), true);
> +}
> +EXPORT_SYMBOL(ioremap_encrypted);
> +
>  void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
>  {
>   return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_cache);
>  
> @@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, 
> unsigned long size,
>  {
>   return __ioremap_caller(phys_addr, size,
>   pgprot2cachemode(__pgprot(prot_val)),
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
>  }
>  EXPORT_SYMBOL(ioremap_prot);
>  
> 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 0/2] support kdump for AMD secure memory encryption(sme)

2018-05-15 Thread Tom Lendacky
On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
> It is convenient to remap the old memory encrypted to the second kernel by
> calling ioremap_encrypted().
> 
> When sme enabled on AMD server, we also need to support kdump. Because
> the memory is encrypted in the first kernel, we will remap the old memory
> encrypted to the second kernel(crash kernel), and sme is also enabled in
> the second kernel, otherwise the old memory encrypted can not be decrypted.
> Because simply changing the value of a C-bit on a page will not
> automatically encrypt the existing contents of a page, and any data in the
> page prior to the C-bit modification will become unintelligible. A page of
> memory that is marked encrypted will be automatically decrypted when read
> from DRAM and will be automatically encrypted when written to DRAM.
> 
> For the kdump, it is necessary to distinguish whether the memory is
> encrypted. Furthermore, we should also know which part of the memory is
> encrypted or decrypted. We will appropriately remap the memory according
> to the specific situation in order to tell cpu how to deal with the
> data(encrypted or decrypted). For example, when sme enabled, if the old
> memory is encrypted, we will remap the old memory in encrypted way, which
> will automatically decrypt the old memory encrypted when we read those data
> from the remapping address.
> 
>  --
> | first-kernel | second-kernel | kdump support |
> |  (mem_encrypt=on|off)|   (yes|no)| 
> |--+---+---|
> | on   | on| yes   |
> | off  | off   | yes   |
> | on   | off   | no|
> | off  | on| no|
> |__|___|___|
> 
> Test tools:
> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
> Author: Lianbo Jiang 
> Date:   Mon May 14 17:02:40 2018 +0800
> Note: This patch can only dump vmcore in the case of SME enabled.
> 
> crash-7.2.1: https://github.com/crash-utility/crash.git
> commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
> Author: Dave Anderson 
> Date:   Fri May 11 15:54:32 2018 -0400
> 
> Test environment:
> HP ProLiant DL385Gen10 AMD EPYC 7251
> 8-Core Processor
> 32768 MB memory
> 600 GB disk space
> 
> Linux 4.17-rc4:
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> commit 75bc37fefc44 ("Linux 4.17-rc4")
> Author: Linus Torvalds 
> Date:   Sun May 6 16:57:38 2018 -1000
> 
> Reference:
> AMD64 Architecture Programmer's Manual
> https://support.amd.com/TechDocs/24593.pdf
> 

Have you also tested this with SEV?  It would be nice if the kdump
changes you make work with both SME and SEV.

Thanks,
Tom

> Lianbo Jiang (2):
>   add a function(ioremap_encrypted) for kdump when AMD sme enabled.
>   support kdump when AMD secure memory encryption is active
> 
>  arch/x86/include/asm/dmi.h  | 14 +-
>  arch/x86/include/asm/io.h   |  2 ++
>  arch/x86/kernel/acpi/boot.c |  8 
>  arch/x86/kernel/crash_dump_64.c | 27 +++
>  arch/x86/mm/ioremap.c   | 25 +
>  drivers/acpi/tables.c   | 14 +-
>  drivers/iommu/amd_iommu_init.c  |  9 -
>  fs/proc/vmcore.c| 36 +++-
>  include/linux/crash_dump.h  |  4 
>  kernel/kexec_core.c | 12 
>  10 files changed, 135 insertions(+), 16 deletions(-)
> 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH 2/2] support kdump when AMD secure memory encryption is active

2018-05-15 Thread kbuild test robot
Hi Lianbo,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.17-rc5 next-20180514]
[cannot apply to tip/x86/core]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Lianbo-Jiang/support-kdump-for-AMD-secure-memory-encryption-sme/20180515-135732
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> drivers/iommu/amd_iommu_init.c:899:27: sparse: incorrect type in assignment 
>> (different address spaces) @@expected struct dev_table_entry *old_devtb 
>> @@got truct dev_table_entry *old_devtb @@
   drivers/iommu/amd_iommu_init.c:899:27:expected struct dev_table_entry 
*old_devtb
   drivers/iommu/amd_iommu_init.c:899:27:got void [noderef] *
   drivers/iommu/amd_iommu_init.c:1740:39: sparse: expression using sizeof(void)
   drivers/iommu/amd_iommu_init.c:1740:39: sparse: expression using sizeof(void)
   drivers/iommu/amd_iommu_init.c:1750:49: sparse: expression using sizeof(void)
   drivers/iommu/amd_iommu_init.c:1750:49: sparse: expression using sizeof(void)
   drivers/iommu/amd_iommu_init.c:2950:18: sparse: symbol 'get_amd_iommu' was 
not declared. Should it be static?
   drivers/iommu/amd_iommu_init.c:2969:4: sparse: symbol 
'amd_iommu_pc_get_max_banks' was not declared. Should it be static?
   drivers/iommu/amd_iommu_init.c:2980:6: sparse: symbol 
'amd_iommu_pc_supported' was not declared. Should it be static?
   drivers/iommu/amd_iommu_init.c:2986:4: sparse: symbol 
'amd_iommu_pc_get_max_counters' was not declared. Should it be static?
   drivers/iommu/amd_iommu_init.c:3035:5: sparse: symbol 'amd_iommu_pc_get_reg' 
was not declared. Should it be static?
   drivers/iommu/amd_iommu_init.c:3044:5: sparse: symbol 'amd_iommu_pc_set_reg' 
was not declared. Should it be static?

vim +899 drivers/iommu/amd_iommu_init.c

   854  
   855  
   856  static bool copy_device_table(void)
   857  {
   858  u64 int_ctl, int_tab_len, entry = 0, last_entry = 0;
   859  struct dev_table_entry *old_devtb = NULL;
   860  u32 lo, hi, devid, old_devtb_size;
   861  phys_addr_t old_devtb_phys;
   862  struct amd_iommu *iommu;
   863  u16 dom_id, dte_v, irq_v;
   864  gfp_t gfp_flag;
   865  u64 tmp;
   866  
   867  if (!amd_iommu_pre_enabled)
   868  return false;
   869  
   870  pr_warn("Translation is already enabled - trying to copy 
translation structures\n");
   871  for_each_iommu(iommu) {
   872  /* All IOMMUs should use the same device table with the 
same size */
   873  lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET);
   874  hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 
4);
   875  entry = (((u64) hi) << 32) + lo;
   876  if (last_entry && last_entry != entry) {
   877  pr_err("IOMMU:%d should use the same dev table 
as others!\n",
   878  iommu->index);
   879  return false;
   880  }
   881  last_entry = entry;
   882  
   883  old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12;
   884  if (old_devtb_size != dev_table_size) {
   885  pr_err("The device table size of IOMMU:%d is 
not expected!\n",
   886  iommu->index);
   887  return false;
   888  }
   889  }
   890  
   891  old_devtb_phys = entry & PAGE_MASK;
   892  if (sme_active() && is_kdump_kernel())
   893  old_devtb_phys = __sme_clr(old_devtb_phys);
   894  if (old_devtb_phys >= 0x1ULL) {
   895  pr_err("The address of old device table is above 4G, 
not trustworthy!\n");
   896  return false;
   897  }
   898  if (sme_active() && is_kdump_kernel())
 > 899  old_devtb = ioremap_encrypted(old_devtb_phys,
   900  dev_table_size);
   901  else
   902  old_devtb = memremap(old_devtb_phys,
   903  dev_table_size, MEMREMAP_WB);
   904  if (!old_devtb)
   905  return false;
   906  
   907  gfp_flag = GFP_KERNEL | __GFP_ZERO | GFP_DMA32;
   908  old_dev_tbl_cpy = (void *)__get_free_pages(gfp_flag,
   909  get_order(dev_table_size));
   910  i

Re: [PATCH 0/2] Remove obsolete kdump tool and add missing uninstall rule

2018-05-15 Thread Bhupesh Sharma
Hello Russell, Simon,

On Mon, Apr 23, 2018 at 10:30 AM, Bhupesh Sharma  wrote:
> This patchset contains two patches:
>
> [1/2] - Adds a missing uninstall rule to 'Makefile.in' to allow easier
> uninstallation of executables and man pages installed via
> 'make install'
> [2/2] - Removes obsolete kdump tool as per the discussion on kexec
> mailing list (see [1] for details), where a conclusion was
> reached that with the availability of tools like crash/gdb the
> analysis of the crashdump core has become rather easy, and it
> makes the kdump tool obsolete.
>
> Also the same naming convention (man page) causes confusion
> when compared to similarly named distribution specific kdump
> service/utilities, so its better to remove the same from
> kexec-tools.
>
> I tested the patchset on my apm mustang and qualcomm arm64 boards and
> the changes seem to work fine.
>
> [1] http://lists.infradead.org/pipermail/kexec/2018-April/020483.html
>
> Cc: Russell King 
> Cc: Simon Horman 
> Cc: Dave Young 
> Cc: Vivek Goyal 
> Cc: AKASHI Takahiro 

Ping. Any comments on this?

Thanks,
Bhupesh

> Bhupesh Sharma (2):
>   Makefile.in: Add uninstall rule
>   Remove obsolete kdump tool
>
>  Makefile.in |  92 +--
>  kdump/Makefile  |  30 -
>  kdump/kdump.8   |  39 ---
>  kdump/kdump.c   | 327 
> 
>  kexec-tools.spec.in |   2 -
>  5 files changed, 83 insertions(+), 407 deletions(-)
>  delete mode 100644 kdump/Makefile
>  delete mode 100644 kdump/kdump.8
>  delete mode 100644 kdump/kdump.c
>
> --
> 2.7.4
>

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2] arm64: Add support to supply 'kaslr-seed' to secondary kernel

2018-05-15 Thread Bhupesh Sharma
Hello Akashi,

On Thu, Apr 26, 2018 at 2:27 PM, Bhupesh Sharma  wrote:
> This patch adds the support to supply 'kaslr-seed' to secondary kernel,
> when we do a 'kexec warm reboot to another kernel' (although the
> behaviour remains the same for the 'kdump' case as well) on arm64
> platforms using the 'kexec_load' invocation method.
>
> Lets consider the case where the primary kernel working on the arm64
> platform supports kaslr (i.e 'CONFIG_RANDOMIZE_BASE' was set to y and
> we have a compliant EFI firmware which supports EFI_RNG_PROTOCOL and
> hence can pass a non-zero (valid) seed to the primary kernel).
>
> Now the primary kernel reads the 'kaslr-seed' and wipes it to 0 and
> uses the seed value to randomize for e.g. the module base address
> offset.
>
> In the case of 'kexec_load' (or even kdump for brevity),
> we rely on the user-space kexec-tools to pass an appropriate dtb to the
> secondary kernel and since 'kaslr-seed' is wiped to 0 by the primary
> kernel, the secondary will essentially work with *nokaslr* as
> 'kaslr-seed' is set to 0 when it is passed to the secondary kernel.
>
> This can be true even in case the secondary kernel had
> 'CONFIG_RANDOMIZE_BASE' and 'CONFIG_RANDOMIZE_MODULE_REGION_FULL' set to
> y.
>
> This patch addresses this issue by first checking if the device tree
> provided by the firmware to the kernel supports the 'kaslr-seed'
> property and verifies that it is really wiped to 0. If this condition is
> met, it fixes up the 'kaslr-seed' property by using the getrandom()
> syscall to get a suitable random number.
>
> I verified this patch on my Qualcomm arm64 board and here are some test
> results:
>
> 1. Ensure that the primary kernel is boot'ed with 'kaslr-seed'
>dts property and it is really wiped to 0:
>
>[root@qualcomm-amberwing]# dtc -I dtb -O dts /sys/firmware/fdt | grep -A 
> 10 -i chosen
> chosen {
> kaslr-seed = <0x0 0x0>;
> ...
> }
>
> 2. Now issue 'kexec_load' to load the secondary kernel (let's assume
>that we are using the same kernel as the secondary kernel):
># kexec -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname
>  -r`.img --reuse-cmdline -d
>
> 3. Issue 'kexec -e' to warm boot to the secondary:
># kexec -e
>
> 4. Now after the secondary boots, confirm that the load address of the
>modules is randomized in every successive boot:
>
>[root@qualcomm-amberwing]# cat /proc/modules
>sunrpc 524288 1 - Live 0x0307db19
>vfat 262144 1 - Live 0x0307db11
>fat 262144 1 vfat, Live 0x0307db09
>crc32_ce 262144 0 - Live 0x0307d8c7
>...
>
> Signed-off-by: Bhupesh Sharma 
> ---
> Changes since v1:
>  - Addressed Akashi's comments regarding the goto label path.
>  - v1 can be viewed here: https://marc.info/?l=kexec&m=152373724406110&w=2

Ping. Any comments on this v2?

Thanks,
Bhupesh

>  kexec/arch/arm64/kexec-arm64.c | 138 
> +
>  1 file changed, 100 insertions(+), 38 deletions(-)
>
> diff --git a/kexec/arch/arm64/kexec-arm64.c b/kexec/arch/arm64/kexec-arm64.c
> index 62f37585b788..a206c172b1aa 100644
> --- a/kexec/arch/arm64/kexec-arm64.c
> +++ b/kexec/arch/arm64/kexec-arm64.c
> @@ -15,6 +15,11 @@
>  #include 
>  #include 
>
> +#include 
> +#include 
> +#include 
> +#include 
> +
>  #include "kexec.h"
>  #include "kexec-arm64.h"
>  #include "crashdump.h"
> @@ -392,11 +397,13 @@ static int fdt_setprop_range(void *fdt, int nodeoffset,
>  static int setup_2nd_dtb(struct dtb *dtb, char *command_line, int on_crash)
>  {
> uint32_t address_cells, size_cells;
> -   int range_len;
> -   int nodeoffset;
> +   uint64_t fdt_val64;
> +   uint64_t *prop;
> char *new_buf = NULL;
> +   int len, range_len;
> +   int nodeoffset;
> int new_size;
> -   int result;
> +   int result, kaslr_seed;
>
> result = fdt_check_header(dtb->buf);
>
> @@ -407,47 +414,103 @@ static int setup_2nd_dtb(struct dtb *dtb, char 
> *command_line, int on_crash)
>
> result = set_bootargs(dtb, command_line);
>
> -   if (on_crash) {
> -   /* determine #address-cells and #size-cells */
> -   result = get_cells_size(dtb->buf, &address_cells, 
> &size_cells);
> -   if (result) {
> -   fprintf(stderr,
> -   "kexec: cannot determine cells-size.\n");
> -   result = -EINVAL;
> -   goto on_error;
> -   }
> +   /* determine #address-cells and #size-cells */
> +   result = get_cells_size(dtb->buf, &address_cells, &size_cells);
> +   if (result) {
> +   fprintf(stderr, "kexec: cannot determine cells-size.\n");
> +   result = -EINVAL;
> +   goto on_error;
> +   }
>
> -   if (!cells_size_fitted(address_cells, size_cells,
> -   &elfcorehdr_mem))