Re: [RFC] arm64: extra entries in /proc/iomem for kexec
On Tue, Apr 24, 2018 at 05:08:57PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 16/04/18 11:08, AKASHI Takahiro wrote:
> > On Thu, Apr 12, 2018 at 05:01:52PM +0100, James Morse wrote:
> >> On 05/04/18 03:42, AKASHI Takahiro wrote:
> >>> On Mon, Apr 02, 2018 at 10:53:32AM +0900, AKASHI Takahiro wrote:
> Basically, changes that I made on /proc/iomem in my new format D were:
> 1. to move NOMAP region entries, formerly named "reserved" and now named
>    "reserved (no map)", under System RAM
> 2. to add new entries for firmware-reserved regions as "reserved" also
>    under System RAM
>
> On the other hand, current kexec-tools, in particular kexec command,
> only scan top-level "System RAM" entries as well as "reserved" entries.
> >>
> >> as well as?
> >
> > I had few words here.
> > The current kexec-tools assumes that "reserved" entries appear only
> > at the top level. So,
> >
> >> Does this mean kexec will pick up the reserved region if its written as:
> >> | 1000-0009d7ff : System RAM
> >> |   1000-1fff : reserved
> >
> > if this is the case, the range "0x1000-0x1fff" is added to an internal
> > list of memory ranges
>
> I found this in get_memory_ranges_iomem_cb()...
>
>
> > but will later be *ignored* by locate_hole() function
> > due to its memory type.
>
> Ugh. Great.
>
>
> > That is, the range can potentially be overwritten by loaded kernel/initrd.
>
> So two kernel bugs, one user-space bug, all conspiring.
>
>
> either because
> a. new kernel (or initrd/dtb) may have been allocated on a NOMAP region
>    which are not suitable for usable memory, or
> b. new kernel (or initrd/dtb) may have been allocated on a reserved region
>    whose contents can be overwritten.
>
> While we see (b) even today, (a) is a backward compatibility issue.
> >>
> >> (a) doesn't happen because request_standard_resources() checks
> >> memblock_is_nomap(), and reports those regions as 'reserved'.
> >
> > I might have confused you. The assumption here was that we adopt format (D),
> > where all NOMAP regions are sub nodes of "System RAM", but still use
> > the current kexec-tools.
> > As I said above, this will end up an un-expected behavior.
>
> I'd like to fix this without having to fix user-space at the same time. It looks
> like no-one else has second level reserved regions,

This was my assumption when I sent out a patch to kexec-tools.

> so we can't blame
> kexec-tools for looking straight at them, then ignoring them.
>
>
> >> We can't expect user-space to upgrade to fix this issue.
> >
> > I'm not sure what you mean here; we can't fix the issue anyway
> > without changing user-space/kexec-tools as kexec_load system call totally
> > relies on parameters passed by kexec-tools.
> > (The only difference is whether we need additional kernel changes or not.)
>
> It looks like this was always broken because the efi memory map isn't listed as
> 'reserved' in /proc/iomem. The fallout for the new stuff is secondary.
>
>
> >>> # I don't know yet whether people are happy with this fix, and also have
> >>> kernel patches for my other approaches. They are neither not much
> >>> complicated.
> >>
> >> I don't think we should fix this in userspace, exporting all the
> >> memblock_reserved() regions as 'reserved' in /proc/iomem looks like the right
> >> thing to do.
> >
> > Again, if you modify /proc/iomem, you have to update kexec-tools, too.
>
> If we squash the memblock_reserved() stuff down so it appears as a top level
> 'reserved' region too, I don't think we do.

If I correctly understand, you're talking about my format (E).
As I said, it will fix the issue without modifying user-space, but
|| This does not only look quite noisy but also ignores the fact that
|| reserved regions are part of System RAM (or memblock.memory).

> This prevents the efi-memory-map
> being overwritten on kernels since kexec was merged.
>
> Its horribly fiddly to do this. The kernel code/data are special reserved
> regions that we already describe as a subset of system-ram, even though they are
> both also fragments of a bigger memblock_reserved() block.

Actually, we don't have to avoid kernel code/data regions as copying
loaded data to the final destinations will be done at the very end of kexec.

> While we can walk memblock for regions that aren't reserved, allocating memory
> in the loop changes what is reserved. That one O(N) walk ends up being four...

At most O(n^2)?

Thanks,
-Takahiro AKASHI

> I'm almost done tearing my hair out, I should have a working patch soon...
>
>
> >> wasn't there going to be another version, with the core EFI
> >> stuff split out?
> >
> > ? I don't remember well ...
>
> https://lkml.org/lkml/2018/2/1/496
>
>
> Thanks,
>
> James
>
___
kexec mailing list
kexec@lists.infradead.org
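
The failure mode discussed in this thread (a scanner that trusts "reserved" entries only at the top level) can be illustrated with a small userspace sketch. This is not kexec-tools code: `struct iomem_entry`, `parse_iomem_line()` and `is_nested_reserved()` are hypothetical names, and the parser assumes two spaces of indentation per nesting level in /proc/iomem.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one /proc/iomem line; not a kexec-tools type. */
struct iomem_entry {
	int depth;			/* nesting level, 0 = top level */
	unsigned long long start;
	unsigned long long end;
	char name[64];
};

/*
 * Parse one line such as "  00001000-00001fff : reserved".
 * Returns 0 on success, -1 if the line does not match.
 * Assumes two spaces of indentation per nesting level.
 */
static int parse_iomem_line(const char *line, struct iomem_entry *e)
{
	int spaces = 0;

	while (line[spaces] == ' ')
		spaces++;
	e->depth = spaces / 2;

	if (sscanf(line + spaces, "%llx-%llx : %63[^\n]",
		   &e->start, &e->end, e->name) != 3)
		return -1;
	return 0;
}

/*
 * Non-zero if the entry is a "reserved" range nested below another
 * resource, i.e. one that a top-level-only scan would not treat as
 * reserved.
 */
static int is_nested_reserved(const struct iomem_entry *e)
{
	return e->depth > 0 && strcmp(e->name, "reserved") == 0;
}
```

A tool that wants to honour format (D) would have to treat every entry for which is_nested_reserved() is true as a hole inside its parent "System RAM" range, rather than recording it and ignoring it later.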
[PATCH v9 11/11] arm64: kexec_file: add kaslr support
Adding "kaslr-seed" to dtb enables triggering kaslr, or kernel virtual
address randomization, at secondary kernel boot. We always do this as
it will do no harm on a kaslr-incapable kernel.

We don't have any "switch" to turn off this feature directly, but still
can suppress it by passing "nokaslr" as a kernel boot argument.

Signed-off-by: AKASHI Takahiro
Cc: Catalin Marinas
Cc: Will Deacon
---
 arch/arm64/kernel/machine_kexec_file.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index ec674f4d267c..762f9102899c 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -246,6 +247,12 @@ static int setup_dtb(struct kimage *image,
 		goto out_err;
 	}
 
+	/* add kaslr-seed */
+	get_random_bytes(&value, sizeof(value));
+	ret = fdt_setprop(buf, nodeoffset, "kaslr-seed", &value, sizeof(value));
+	if (ret)
+		goto out_err;
+
 	/* trim a buffer */
 	fdt_pack(buf);
 	*dtb_buf = buf;
-- 
2.17.0

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v9 09/11] include: pe.h: remove message[] from mz header definition
message[] field won't be part of the definition of mz header.

This change is crucial for enabling kexec_file_load on arm64 because
arm64's "Image" binary, as in PE format, doesn't have any data for it and
accordingly the following check in pefile_parse_binary() will fail:

	chkaddr(cursor, mz->peaddr, sizeof(*pe));

Signed-off-by: AKASHI Takahiro
Reviewed-by: Ard Biesheuvel
Cc: David Howells
Cc: Vivek Goyal
Cc: Herbert Xu
Cc: David S. Miller
---
 include/linux/pe.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/pe.h b/include/linux/pe.h
index 143ce75be5f0..3482b18a48b5 100644
--- a/include/linux/pe.h
+++ b/include/linux/pe.h
@@ -166,7 +166,7 @@ struct mz_hdr {
 	uint16_t oem_info;	/* oem specific */
 	uint16_t reserved1[10];	/* reserved */
 	uint32_t peaddr;	/* address of pe header */
-	char message[64];	/* message to print */
+	char message[];		/* message to print */
 };
 
 struct mz_reloc {
-- 
2.17.0
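
To make the size argument concrete, here is a minimal userspace sketch. The two structs are simplified stand-ins for struct mz_hdr (the 30 leading 16-bit fields are collapsed into one array), pe_header_after_mz() only approximates the lower-bound part of chkaddr(), and the PE header offset of 0x40 is my assumption about a typical arm64 "Image", not something stated in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the old definition: 64-byte header plus 64-byte message. */
struct mz_hdr_fixed {
	uint16_t fields[30];	/* magic .. reserved1[], collapsed */
	uint32_t peaddr;	/* address of pe header */
	char message[64];	/* message to print */
};

/* Stand-in for the new definition: message[] no longer adds to sizeof. */
struct mz_hdr_flex {
	uint16_t fields[30];
	uint32_t peaddr;
	char message[];		/* flexible array member */
};

/*
 * Approximation of the lower-bound test: the PE header must not start
 * before the end of the MZ header that was just consumed.
 */
static int pe_header_after_mz(size_t mz_size, uint32_t peaddr)
{
	return peaddr >= mz_size;
}
```

With these definitions sizeof(struct mz_hdr_fixed) is 128 and sizeof(struct mz_hdr_flex) is 64, so a PE header placed at offset 0x40 is rejected with the old layout and accepted with the new one.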
[PATCH v9 08/11] arm64: enable KEXEC_FILE config
Modify arm64/Kconfig to enable kexec_file_load support.

Signed-off-by: AKASHI Takahiro
Cc: Catalin Marinas
Cc: Will Deacon
---
 arch/arm64/Kconfig | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index eb2cf4938f6d..d8f0dcdb8b96 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -847,6 +847,16 @@ config KEXEC
 	  but it is independent of the system firmware. And like a reboot
 	  you can start any kernel with it, not just Linux.
 
+config KEXEC_FILE
+	bool "kexec file based system call"
+	select KEXEC_CORE
+	select BUILD_BIN2C
+	help
+	  This is new version of kexec system call. This system call is
+	  file based and takes file descriptors as system call argument
+	  for kernel and initramfs as opposed to list of segments as
+	  accepted by previous system call.
+
 config CRASH_DUMP
 	bool "Build kdump crash kernel"
 	help
-- 
2.17.0
[PATCH v9 10/11] arm64: kexec_file: add kernel signature verification support
With this patch, kernel verification can be done without IMA security subsystem enabled. Turn on CONFIG_KEXEC_VERIFY_SIG instead. On x86, a signature is embedded into a PE file (Microsoft's format) header of binary. Since arm64's "Image" can also be seen as a PE file as far as CONFIG_EFI is enabled, we adopt this format for kernel signing. You can create a signed kernel image with: $ sbsign --key ${KEY} --cert ${CERT} Image Signed-off-by: AKASHI TakahiroCc: Catalin Marinas Cc: Will Deacon --- arch/arm64/Kconfig | 24 arch/arm64/include/asm/kexec.h | 16 arch/arm64/kernel/kexec_image.c | 15 +++ 3 files changed, 55 insertions(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index d8f0dcdb8b96..5c772601840d 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -857,6 +857,30 @@ config KEXEC_FILE for kernel and initramfs as opposed to list of segments as accepted by previous system call. +config KEXEC_VERIFY_SIG + bool "Verify kernel signature during kexec_file_load() syscall" + depends on KEXEC_FILE + help + Select this option to verify a signature with loaded kernel + image. If configured, any attempt of loading a image without + valid signature will fail. + + In addition to that option, you need to enable signature + verification for the corresponding kernel image type being + loaded in order for this to work. + +config KEXEC_IMAGE_VERIFY_SIG + bool "Enable Image signature verification support" + default y + depends on KEXEC_VERIFY_SIG + depends on EFI && SIGNED_PE_FILE_VERIFICATION + help + Enable Image signature verification support. 
+ +comment "Image signature verification is missing yet" + depends on KEXEC_VERIFY_SIG + depends on !EFI || !SIGNED_PE_FILE_VERIFICATION + config CRASH_DUMP bool "Build kdump crash kernel" help diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h index 77f05bcf6a42..891f2484969d 100644 --- a/arch/arm64/include/asm/kexec.h +++ b/arch/arm64/include/asm/kexec.h @@ -133,6 +133,7 @@ struct arm64_image_header { }; static const u8 arm64_image_magic[4] = {'A', 'R', 'M', 0x64U}; +static const u8 arm64_image_pe_sig[2] = {'M', 'Z'}; /** * arm64_header_check_magic - Helper to check the arm64 image header. @@ -154,6 +155,21 @@ static inline int arm64_header_check_magic(const struct arm64_image_header *h) && h->magic[3] == arm64_image_magic[3]); } +/** + * arm64_header_check_pe_sig - Helper to check the arm64 image header. + * + * Returns non-zero if 'MZ' signature is found. + */ + +static inline int arm64_header_check_pe_sig(const struct arm64_image_header *h) +{ + if (!h) + return 0; + + return (h->pe_sig[0] == arm64_image_pe_sig[0] + && h->pe_sig[1] == arm64_image_pe_sig[1]); +} + extern const struct kexec_file_ops kexec_image_ops; struct kimage; diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c index 2b3baf7285e0..7c11beefe65f 100644 --- a/arch/arm64/kernel/kexec_image.c +++ b/arch/arm64/kernel/kexec_image.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -24,6 +25,9 @@ static int image_probe(const char *kernel_buf, unsigned long kernel_len) if ((kernel_len < sizeof(*h)) || !arm64_header_check_magic(h)) return -EINVAL; + pr_debug("PE format: %s\n", + (arm64_header_check_pe_sig(h) ? 
"yes" : "no")); + return 0; } @@ -78,7 +82,18 @@ static void *image_load(struct kimage *image, return ERR_PTR(ret); } +#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG +static int image_verify_sig(const char *kernel, unsigned long kernel_len) +{ + return verify_pefile_signature(kernel, kernel_len, NULL, + VERIFYING_KEXEC_PE_SIGNATURE); +} +#endif + const struct kexec_file_ops kexec_image_ops = { .probe = image_probe, .load = image_load, +#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG + .verify_sig = image_verify_sig, +#endif }; -- 2.17.0 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v9 07/11] arm64: kexec_file: add crash dump support
Enabling crash dump (kdump) includes * prepare contents of ELF header of a core dump file, /proc/vmcore, using crash_prepare_elf64_headers(), and * add two device tree properties, "linux,usable-memory-range" and "linux,elfcorehdr", which represent repsectively a memory range to be used by crash dump kernel and the header's location Signed-off-by: AKASHI TakahiroCc: Catalin Marinas Cc: Will Deacon --- arch/arm64/include/asm/kexec.h | 4 + arch/arm64/kernel/kexec_image.c| 9 +- arch/arm64/kernel/machine_kexec_file.c | 202 + 3 files changed, 213 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h index 3cba4161818a..77f05bcf6a42 100644 --- a/arch/arm64/include/asm/kexec.h +++ b/arch/arm64/include/asm/kexec.h @@ -100,6 +100,10 @@ struct kimage_arch { int kern_segment; phys_addr_t dtb_mem; void *dtb_buf; + /* Core ELF header buffer */ + void *elf_headers; + unsigned long elf_headers_sz; + unsigned long elf_load_addr; }; /** diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c index 4dd524ad6611..2b3baf7285e0 100644 --- a/arch/arm64/kernel/kexec_image.c +++ b/arch/arm64/kernel/kexec_image.c @@ -39,8 +39,13 @@ static void *image_load(struct kimage *image, /* Load the kernel */ kbuf.image = image; - kbuf.buf_min = 0; - kbuf.buf_max = ULONG_MAX; + if (image->type == KEXEC_TYPE_CRASH) { + kbuf.buf_min = crashk_res.start; + kbuf.buf_max = crashk_res.end + 1; + } else { + kbuf.buf_min = 0; + kbuf.buf_max = ULONG_MAX; + } kbuf.top_down = false; kbuf.buffer = kernel; diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c index 37c0a9dc2e47..ec674f4d267c 100644 --- a/arch/arm64/kernel/machine_kexec_file.c +++ b/arch/arm64/kernel/machine_kexec_file.c @@ -17,6 +17,7 @@ #include #include #include +#include #include static int __dt_root_addr_cells; @@ -32,6 +33,10 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image) vfree(image->arch.dtb_buf); 
image->arch.dtb_buf = NULL; + vfree(image->arch.elf_headers); + image->arch.elf_headers = NULL; + image->arch.elf_headers_sz = 0; + return kexec_image_post_load_cleanup_default(image); } @@ -76,6 +81,78 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf, return ret; } +static int __init arch_kexec_file_init(void) +{ + /* Those values are used later on loading the kernel */ + __dt_root_addr_cells = dt_root_addr_cells; + __dt_root_size_cells = dt_root_size_cells; + + return 0; +} +late_initcall(arch_kexec_file_init); + +#define FDT_ALIGN(x, a)(((x) + (a) - 1) & ~((a) - 1)) +#define FDT_TAGALIGN(x)(FDT_ALIGN((x), FDT_TAGSIZE)) + +static int fdt_prop_len(const char *prop_name, int len) +{ + return (strlen(prop_name) + 1) + + sizeof(struct fdt_property) + + FDT_TAGALIGN(len); +} + +static bool cells_size_fitted(unsigned long base, unsigned long size) +{ + /* if *_cells >= 2, cells can hold 64-bit values anyway */ + if ((__dt_root_addr_cells == 1) && (base >= (1ULL << 32))) + return false; + + if ((__dt_root_size_cells == 1) && (size >= (1ULL << 32))) + return false; + + return true; +} + +static void fill_property(void *buf, u64 val64, int cells) +{ + u32 val32; + + if (cells == 1) { + val32 = cpu_to_fdt32((u32)val64); + memcpy(buf, , sizeof(val32)); + } else { + memset(buf, 0, cells * sizeof(u32) - sizeof(u64)); + buf += cells * sizeof(u32) - sizeof(u64); + + val64 = cpu_to_fdt64(val64); + memcpy(buf, , sizeof(val64)); + } +} + +static int fdt_setprop_range(void *fdt, int nodeoffset, const char *name, + unsigned long addr, unsigned long size) +{ + void *buf, *prop; + size_t buf_size; + int result; + + buf_size = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32); + prop = buf = vmalloc(buf_size); + if (!buf) + return -ENOMEM; + + fill_property(prop, addr, __dt_root_addr_cells); + prop += __dt_root_addr_cells * sizeof(u32); + + fill_property(prop, size, __dt_root_size_cells); + + result = fdt_setprop(fdt, nodeoffset, name, buf, buf_size); + + vfree(buf); + + 
return result; +} + static int setup_dtb(struct kimage *image, unsigned long initrd_load_addr, unsigned long initrd_len, char *cmdline, unsigned long cmdline_len, @@ -88,10 +165,26 @@ static int setup_dtb(struct kimage *image, int range_len;
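
The cell encoding done by fill_property() and the range check in cells_size_fitted() can be tried out in userspace with the sketch below. The names encode_cells(), fits_in_cells() and put_be32() are hypothetical and the byte swapping is open-coded instead of using cpu_to_fdt32()/cpu_to_fdt64(); the intent is only to show the resulting byte layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in big-endian byte order, as FDT cells require. */
static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Non-zero if val can be represented in the given number of cells. */
static int fits_in_cells(uint64_t val, int cells)
{
	/* two or more cells can always hold a 64-bit value */
	return cells >= 2 || val < (1ULL << 32);
}

/*
 * Encode val into 'cells' 32-bit big-endian cells at buf.
 * With more than two cells, the leading cells are zero-filled.
 */
static void encode_cells(uint8_t *buf, uint64_t val, int cells)
{
	memset(buf, 0, (size_t)cells * 4);
	if (cells == 1) {
		put_be32(buf, (uint32_t)val);
		return;
	}
	put_be32(buf + (size_t)(cells - 2) * 4, (uint32_t)(val >> 32));
	put_be32(buf + (size_t)(cells - 1) * 4, (uint32_t)val);
}
```

For example, with two address cells the value 0x180000000 is stored as the bytes 00 00 00 01 80 00 00 00, while with one cell it cannot be represented at all, which is the case cells_size_fitted() rejects.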
[PATCH v9 05/11] arm64: kexec_file: load initrd and device-tree
load_other_segments() is expected to allocate and place all the necessary memory segments other than kernel, including initrd and device-tree blob (and elf core header for crash). While most of the code was borrowed from kexec-tools' counterpart, users may not be allowed to specify dtb explicitly, instead, the dtb presented by a boot loader is reused. arch_kimage_kernel_post_load_cleanup() is responsible for freeing arm64- specific data allocated in load_other_segments(). Signed-off-by: AKASHI TakahiroCc: Catalin Marinas Cc: Will Deacon --- arch/arm64/include/asm/kexec.h | 16 +++ arch/arm64/kernel/machine_kexec_file.c | 160 + 2 files changed, 176 insertions(+) diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h index e17f0529a882..e4de1223715f 100644 --- a/arch/arm64/include/asm/kexec.h +++ b/arch/arm64/include/asm/kexec.h @@ -93,6 +93,22 @@ static inline void crash_prepare_suspend(void) {} static inline void crash_post_resume(void) {} #endif +#ifdef CONFIG_KEXEC_FILE +#define ARCH_HAS_KIMAGE_ARCH + +struct kimage_arch { + int kern_segment; + phys_addr_t dtb_mem; + void *dtb_buf; +}; + +struct kimage; + +extern int load_other_segments(struct kimage *image, + char *initrd, unsigned long initrd_len, + char *cmdline, unsigned long cmdline_len); +#endif + #endif /* __ASSEMBLY__ */ #endif diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c index f9ebf54ca247..b3b9b1725d8a 100644 --- a/arch/arm64/kernel/machine_kexec_file.c +++ b/arch/arm64/kernel/machine_kexec_file.c @@ -13,7 +13,26 @@ #include #include #include +#include #include +#include +#include +#include + +static int __dt_root_addr_cells; +static int __dt_root_size_cells; + +const struct kexec_file_ops * const kexec_file_loaders[] = { + NULL +}; + +int arch_kimage_file_post_load_cleanup(struct kimage *image) +{ + vfree(image->arch.dtb_buf); + image->arch.dtb_buf = NULL; + + return kexec_image_post_load_cleanup_default(image); +} int 
arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(struct resource *, void *)) @@ -55,3 +74,144 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf, return ret; } + +static int setup_dtb(struct kimage *image, + unsigned long initrd_load_addr, unsigned long initrd_len, + char *cmdline, unsigned long cmdline_len, + char **dtb_buf, size_t *dtb_buf_len) +{ + char *buf = NULL; + size_t buf_size; + int nodeoffset; + u64 value; + int range_len; + int ret; + + /* duplicate dt blob */ + buf_size = fdt_totalsize(initial_boot_params); + range_len = (__dt_root_addr_cells + __dt_root_size_cells) * sizeof(u32); + + if (initrd_load_addr) + buf_size += fdt_prop_len("linux,initrd-start", sizeof(u64)) + + fdt_prop_len("linux,initrd-end", sizeof(u64)); + + if (cmdline) + buf_size += fdt_prop_len("bootargs", cmdline_len + 1); + + buf = vmalloc(buf_size); + if (!buf) { + ret = -ENOMEM; + goto out_err; + } + + ret = fdt_open_into(initial_boot_params, buf, buf_size); + if (ret) + goto out_err; + + nodeoffset = fdt_path_offset(buf, "/chosen"); + if (nodeoffset < 0) + goto out_err; + + /* add bootargs */ + if (cmdline) { + ret = fdt_setprop(buf, nodeoffset, "bootargs", + cmdline, cmdline_len + 1); + if (ret) + goto out_err; + } + + /* add initrd-* */ + if (initrd_load_addr) { + value = cpu_to_fdt64(initrd_load_addr); + ret = fdt_setprop(buf, nodeoffset, "linux,initrd-start", + , sizeof(value)); + if (ret) + goto out_err; + + value = cpu_to_fdt64(initrd_load_addr + initrd_len); + ret = fdt_setprop(buf, nodeoffset, "linux,initrd-end", + , sizeof(value)); + if (ret) + goto out_err; + } + + /* trim a buffer */ + fdt_pack(buf); + *dtb_buf = buf; + *dtb_buf_len = fdt_totalsize(buf); + + return 0; + +out_err: + vfree(buf); + return ret; +} + +int load_other_segments(struct kimage *image, + char *initrd, unsigned long initrd_len, + char *cmdline, unsigned long cmdline_len) +{ + struct kexec_segment *kern_seg; + struct kexec_buf kbuf; + unsigned long initrd_load_addr = 0; + char *dtb = NULL; + 
unsigned long dtb_len = 0; + int ret = 0; + +
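
setup_dtb() sizes the new buffer by adding, for each property it is going to set, an estimate of the space that property needs. A standalone sketch of that estimate is shown here; prop_space(), tag_align() and struct prop_header are hypothetical names, with the header assumed to mirror libfdt's struct fdt_property (tag, len, nameoff) and TAG_SIZE assumed to equal FDT_TAGSIZE.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TAG_SIZE 4u	/* assumed equal to libfdt's FDT_TAGSIZE */

/* Assumed layout of a property record header in the structure block. */
struct prop_header {
	uint32_t tag;
	uint32_t len;
	uint32_t nameoff;
};

/* Round x up to the next multiple of TAG_SIZE. */
static size_t tag_align(size_t x)
{
	return (x + TAG_SIZE - 1) & ~(size_t)(TAG_SIZE - 1);
}

/*
 * Upper bound on the space needed to add one property:
 * name string (with NUL) + record header + padded value.
 */
static size_t prop_space(const char *name, size_t value_len)
{
	return strlen(name) + 1 + sizeof(struct prop_header)
		+ tag_align(value_len);
}
```

For a 10-byte "bootargs" value this gives 9 + 12 + 12 = 33 bytes; the estimate is deliberately generous, since fdt_pack() trims the buffer afterwards.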
[PATCH v9 06/11] arm64: kexec_file: allow for loading Image-format kernel
This patch provides kexec_file_ops for "Image"-format kernel. In this implementation, a binary is always loaded with a fixed offset identified in text_offset field of its header. Regarding signature verification for trusted boot, this patch doesn't contains CONFIG_KEXEC_VERIFY_SIG support, which is to be added later in this series, but file-attribute-based verification is still a viable option by enabling IMA security subsystem. You can sign(label) a to-be-kexec'ed kernel image on target file system with: $ evmctl ima_sign --key /path/to/private_key.pem Image On live system, you must have IMA enforced with, at least, the following security policy: "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" See more details about IMA here: https://sourceforge.net/p/linux-ima/wiki/Home/ Signed-off-by: AKASHI TakahiroCc: Catalin Marinas Cc: Will Deacon --- arch/arm64/include/asm/kexec.h | 50 arch/arm64/kernel/Makefile | 2 +- arch/arm64/kernel/kexec_image.c| 79 ++ arch/arm64/kernel/machine_kexec_file.c | 1 + 4 files changed, 131 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/kernel/kexec_image.c diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h index e4de1223715f..3cba4161818a 100644 --- a/arch/arm64/include/asm/kexec.h +++ b/arch/arm64/include/asm/kexec.h @@ -102,6 +102,56 @@ struct kimage_arch { void *dtb_buf; }; +/** + * struct arm64_image_header - arm64 kernel image header + * + * @pe_sig: Optional PE format 'MZ' signature + * @branch_code: Instruction to branch to stext + * @text_offset: Image load offset, little endian + * @image_size: Effective image size, little endian + * @flags: + * Bit 0: Kernel endianness. 
0=little endian, 1=big endian + * @reserved: Reserved + * @magic: Magic number, "ARM\x64" + * @pe_header: Optional offset to a PE format header + **/ + +struct arm64_image_header { + u8 pe_sig[2]; + u8 pad[2]; + u32 branch_code; + u64 text_offset; + u64 image_size; + u64 flags; + u64 reserved[3]; + u8 magic[4]; + u32 pe_header; +}; + +static const u8 arm64_image_magic[4] = {'A', 'R', 'M', 0x64U}; + +/** + * arm64_header_check_magic - Helper to check the arm64 image header. + * + * Returns non-zero if header is OK. + */ + +static inline int arm64_header_check_magic(const struct arm64_image_header *h) +{ + if (!h) + return 0; + + if (!h->text_offset) + return 0; + + return (h->magic[0] == arm64_image_magic[0] + && h->magic[1] == arm64_image_magic[1] + && h->magic[2] == arm64_image_magic[2] + && h->magic[3] == arm64_image_magic[3]); +} + +extern const struct kexec_file_ops kexec_image_ops; + struct kimage; extern int load_other_segments(struct kimage *image, diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 2f2b2757ae7a..1e110aa571dd 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -50,7 +50,7 @@ arm64-obj-$(CONFIG_RANDOMIZE_BASE)+= kaslr.o arm64-obj-$(CONFIG_HIBERNATION)+= hibernate.o hibernate-asm.o arm64-obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o \ cpu-reset.o -arm64-obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file.o +arm64-obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file.o kexec_image.o arm64-obj-$(CONFIG_ARM64_RELOC_TEST) += arm64-reloc-test.o arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c new file mode 100644 index ..4dd524ad6611 --- /dev/null +++ b/arch/arm64/kernel/kexec_image.c @@ -0,0 +1,79 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Kexec image loader + + * Copyright (C) 2018 Linaro Limited + * Author: AKASHI Takahiro + */ + +#define 
pr_fmt(fmt)"kexec_file(Image): " fmt + +#include +#include +#include +#include +#include +#include + +static int image_probe(const char *kernel_buf, unsigned long kernel_len) +{ + const struct arm64_image_header *h; + + h = (const struct arm64_image_header *)(kernel_buf); + + if ((kernel_len < sizeof(*h)) || !arm64_header_check_magic(h)) + return -EINVAL; + + return 0; +} + +static void *image_load(struct kimage *image, + char *kernel, unsigned long kernel_len, + char *initrd, unsigned long initrd_len, + char *cmdline, unsigned long cmdline_len) +{ + struct kexec_buf kbuf; + struct arm64_image_header *h = (struct arm64_image_header *)kernel; + unsigned long text_offset; + int ret; + +
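
The header check used by image_probe() can be reproduced in userspace for a quick sanity test of an Image file. This is a sketch under assumptions: the offsets are derived from the struct arm64_image_header layout in this patch (text_offset at byte 8, magic at byte 56, 64 bytes in total), and looks_like_arm64_image() and has_pe_signature() are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMAGE_HEADER_SIZE	64	/* sizeof(struct arm64_image_header) */
#define TEXT_OFFSET_OFF		8	/* offset of text_offset */
#define MAGIC_OFF		56	/* offset of magic[] */

/* Non-zero if buf starts with a plausible arm64 Image header. */
static int looks_like_arm64_image(const uint8_t *buf, size_t len)
{
	static const uint8_t magic[4] = { 'A', 'R', 'M', 0x64 };
	uint64_t text_offset;

	if (!buf || len < IMAGE_HEADER_SIZE)
		return 0;

	/* memcpy avoids unaligned access; a zero test needs no byte swap */
	memcpy(&text_offset, buf + TEXT_OFFSET_OFF, sizeof(text_offset));
	if (!text_offset)
		return 0;

	return memcmp(buf + MAGIC_OFF, magic, sizeof(magic)) == 0;
}

/* Non-zero if the optional PE 'MZ' signature is present. */
static int has_pe_signature(const uint8_t *buf, size_t len)
{
	return buf && len >= 2 && buf[0] == 'M' && buf[1] == 'Z';
}
```

As in arm64_header_check_magic(), a header whose text_offset is zero is rejected even when the magic matches.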
[PATCH v9 04/11] arm64: kexec_file: allocate memory walking through memblock list
We need to prevent firmware-reserved memory regions, particularly EFI memory map as well as ACPI tables, from being corrupted by loading kernel/initrd (or other kexec buffers). We also want to support memory allocation in top-down manner in addition to default bottom-up. So let's have arm64 specific arch_kexec_walk_mem() which will search for available memory ranges in usable memblock list, i.e. !NOMAP & !reserved, instead of system resource tree. Signed-off-by: AKASHI TakahiroCc: Catalin Marinas Cc: Will Deacon --- arch/arm64/kernel/Makefile | 3 +- arch/arm64/kernel/machine_kexec_file.c | 57 ++ 2 files changed, 59 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/kernel/machine_kexec_file.c diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index bf825f38d206..2f2b2757ae7a 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -48,8 +48,9 @@ arm64-obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o arm64-obj-$(CONFIG_PARAVIRT) += paravirt.o arm64-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o arm64-obj-$(CONFIG_HIBERNATION)+= hibernate.o hibernate-asm.o -arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o \ +arm64-obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o \ cpu-reset.o +arm64-obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file.o arm64-obj-$(CONFIG_ARM64_RELOC_TEST) += arm64-reloc-test.o arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c new file mode 100644 index ..f9ebf54ca247 --- /dev/null +++ b/arch/arm64/kernel/machine_kexec_file.c @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * kexec_file for arm64 + * + * Copyright (C) 2018 Linaro Limited + * Author: AKASHI Takahiro + * + * Most code is derived from arm64 port of kexec-tools + */ + +#define pr_fmt(fmt) "kexec_file: " fmt + +#include +#include +#include +#include + 
+int arch_kexec_walk_mem(struct kexec_buf *kbuf, + int (*func)(struct resource *, void *)) +{ + phys_addr_t start, end; + struct resource res; + u64 i; + int ret = 0; + + if (kbuf->image->type == KEXEC_TYPE_CRASH) + return func(_res, kbuf); + + if (kbuf->top_down) + for_each_mem_range_rev(i, , , + NUMA_NO_NODE, MEMBLOCK_NONE, + , , NULL) { + if (!memblock_is_map_memory(start)) + continue; + + res.start = start; + res.end = end; + ret = func(, kbuf); + if (ret) + break; + } + else + for_each_mem_range(i, , , + NUMA_NO_NODE, MEMBLOCK_NONE, + , , NULL) { + if (!memblock_is_map_memory(start)) + continue; + + res.start = start; + res.end = end; + ret = func(, kbuf); + if (ret) + break; + } + + return ret; +} -- 2.17.0 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
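
The control flow of arch_kexec_walk_mem() (visit usable ranges bottom-up or top-down, hand each one to a callback, stop at the first non-zero return) can be sketched without memblock. Everything here is hypothetical: struct mem_range with its 'usable' flag stands in for "!NOMAP & !reserved", and first_fit() is a toy callback, not the locate_mem_hole callback used by the kexec core.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory range; 'usable' stands for !NOMAP && !reserved. */
struct mem_range {
	uint64_t start;
	uint64_t end;		/* inclusive */
	int usable;
};

typedef int (*range_fn)(const struct mem_range *r, void *arg);

/*
 * Walk an ascending array of ranges in either direction, skipping
 * unusable ones, until the callback returns non-zero.
 */
static int walk_usable(const struct mem_range *ranges, size_t n,
		       int top_down, range_fn fn, void *arg)
{
	int ret = 0;
	size_t k;

	for (k = 0; k < n; k++) {
		const struct mem_range *r = &ranges[top_down ? n - 1 - k : k];

		if (!r->usable)
			continue;
		ret = fn(r, arg);
		if (ret)
			break;
	}
	return ret;
}

/* Toy callback: accept the first range that is large enough. */
struct want {
	uint64_t size;
	uint64_t found_start;
};

static int first_fit(const struct mem_range *r, void *arg)
{
	struct want *w = arg;

	if (r->end - r->start + 1 < w->size)
		return 0;
	w->found_start = r->start;
	return 1;
}
```

Switching top_down changes only which candidate is found first, which is the property the top-down allocation in this patch relies on.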
[PATCH v9 02/11] kexec_file: make kexec_image_post_load_cleanup_default() global
Change this function from static to global so that arm64 can implement
its own arch_kimage_file_post_load_cleanup() later using
kexec_image_post_load_cleanup_default().

Signed-off-by: AKASHI Takahiro
Cc: Dave Young
Cc: Vivek Goyal
Cc: Baoquan He
---
 include/linux/kexec.h | 1 +
 kernel/kexec_file.c   | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 9e4e638fb505..49ab758f4d91 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -143,6 +143,7 @@ extern const struct kexec_file_ops * const kexec_file_loaders[];
 
 int kexec_image_probe_default(struct kimage *image, void *buf,
 			      unsigned long buf_len);
+int kexec_image_post_load_cleanup_default(struct kimage *image);
 
 /**
  * struct kexec_buf - parameters for finding a place for a buffer in memory
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 75d8e7cf040e..eef89d9b1f03 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -78,7 +78,7 @@ void * __weak arch_kexec_kernel_image_load(struct kimage *image)
 	return kexec_image_load_default(image);
 }
 
-static int kexec_image_post_load_cleanup_default(struct kimage *image)
+int kexec_image_post_load_cleanup_default(struct kimage *image)
 {
 	if (!image->fops || !image->fops->cleanup)
 		return 0;
-- 
2.17.0
[PATCH v9 03/11] arm64: kexec_file: invoke the kernel without purgatory
On arm64, purgatory would do almost nothing. So just invoke secondary
kernel directly by jumping into its entry code.

While, in this case, cpu_soft_restart() must be called with dtb address
in the fifth argument, the behavior still stays compatible with kexec_load
case as long as the argument is null.

Signed-off-by: AKASHI Takahiro
Cc: Catalin Marinas
Cc: Will Deacon
---
 arch/arm64/kernel/cpu-reset.S       |  6 +++---
 arch/arm64/kernel/machine_kexec.c   | 11 +--
 arch/arm64/kernel/relocate_kernel.S |  3 ++-
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 8021b46c9743..391df91328ac 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -24,9 +24,9 @@
  *
  * @el2_switch: Flag to indicate a swich to EL2 is needed.
  * @entry: Location to jump to for soft reset.
- * arg0: First argument passed to @entry.
- * arg1: Second argument passed to @entry.
- * arg2: Third argument passed to @entry.
+ * arg0: First argument passed to @entry. (relocation list)
+ * arg1: Second argument passed to @entry.(physical kernel entry)
+ * arg2: Third argument passed to @entry. (physical dtb address)
  *
  * Put the CPU into the same state as it would be if it had been reset, and
  * branch to what would be the reset vector. It must be executed with the
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index f76ea92dff91..f7dbba00be10 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -205,10 +205,17 @@ void machine_kexec(struct kimage *kimage)
 	 * uses physical addressing to relocate the new image to its final
 	 * position and transfers control to the image entry point when the
 	 * relocation is complete.
+	 * In case of kexec_file_load syscall, we directly start the kernel,
+	 * skipping purgatory.
 	 */
-
 	cpu_soft_restart(kimage != kexec_crash_image,
-		reboot_code_buffer_phys, kimage->head, kimage->start, 0);
+		reboot_code_buffer_phys, kimage->head, kimage->start,
+#ifdef CONFIG_KEXEC_FILE
+		kimage->purgatory_info.purgatory_buf ?
+			0 : kimage->arch.dtb_mem);
+#else
+		0);
+#endif
 
 	BUG(); /* Should never get here. */
 }
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index f407e422a720..95fd94209aae 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -32,6 +32,7 @@ ENTRY(arm64_relocate_new_kernel)
 
 	/* Setup the list loop variables. */
+	mov	x18, x2				/* x18 = dtb address */
 	mov	x17, x1				/* x17 = kimage_start */
 	mov	x16, x0				/* x16 = kimage_head */
 	raw_dcache_line_size x15, x0		/* x15 = dcache line size */
@@ -107,7 +108,7 @@ ENTRY(arm64_relocate_new_kernel)
 	isb
 
 	/* Start new image. */
-	mov	x0, xzr
+	mov	x0, x18
 	mov	x1, xzr
 	mov	x2, xzr
 	mov	x3, xzr
-- 
2.17.0
[PATCH v9 01/11] asm-generic: add kexec_file_load system call to unistd.h
The initial user of this system call number is arm64.

Signed-off-by: AKASHI Takahiro
Acked-by: Arnd Bergmann
---
 include/uapi/asm-generic/unistd.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 8bcb186c6f67..745bad1d8269 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -732,9 +732,11 @@ __SYSCALL(__NR_pkey_alloc,sys_pkey_alloc)
 __SYSCALL(__NR_pkey_free, sys_pkey_free)
 #define __NR_statx 291
 __SYSCALL(__NR_statx, sys_statx)
+#define __NR_kexec_file_load 292
+__SYSCALL(__NR_kexec_file_load, sys_kexec_file_load)
 
 #undef __NR_syscalls
-#define __NR_syscalls 292
+#define __NR_syscalls 293
 
 /*
  * 32 bit systems traditionally used different
-- 
2.17.0
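
Once a system call number is assigned, user space can reach the call through syscall(2). The following is a rough sketch rather than what kexec-tools does: the wrapper name is hypothetical, and it assumes the C library headers already export SYS_kexec_file_load for the target architecture (it falls back to ENOSYS otherwise). The argument order follows the kexec_file_load(2) man page.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical thin wrapper around the raw system call.
 * Returns 0 on success, -1 with errno set on failure.
 */
static long kexec_file_load_sketch(int kernel_fd, int initrd_fd,
				   const char *cmdline, unsigned long flags)
{
#ifdef SYS_kexec_file_load
	/* cmdline_len must include the terminating NUL byte. */
	unsigned long cmdline_len = (unsigned long)strlen(cmdline) + 1;

	return syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
		       cmdline_len, cmdline, flags);
#else
	/* Headers predate the system call number on this architecture. */
	(void)kernel_fd;
	(void)initrd_fd;
	(void)cmdline;
	(void)flags;
	errno = ENOSYS;
	return -1;
#endif
}
```

A caller would open the kernel image and the initramfs, pass both descriptors with a command line, and inspect errno on failure; the call needs CAP_SYS_BOOT, so an unprivileged test only exercises the error path.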
[PATCH v9 00/11] arm64: kexec: add kexec_file_load() support
This is the ninth round of implementing kexec_file_load() support
on arm64.[1]
Most of the code is based on kexec-tools.

This patch series enables us to
  * load the kernel by specifying its file descriptor, instead of a
    user-filled buffer, at kexec_file_load() system call, and
  * optionally verify its signature at load time for trusted boot.
Kernel virtual address randomization is also supported since v9.

Contrary to kexec_load() system call, as we discussed a long time ago,
users may not be allowed to provide a device tree to the 2nd kernel
explicitly, hence enforcing a dt blob of the first kernel to be re-used
internally.

To use kexec_file_load() system call, instead of kexec_load(), at kexec
command, '-s' option must be specified. See [2] for a necessary patch for
kexec-tools.

To analyze a generated crash dump file, use the latest master branch of
crash utility[3]. I always try to submit patches to fix any
inconsistencies introduced in the latest kernel.

Regarding kernel image verification, a signature must be presented
along with the binary itself. A signature is basically a hash value
calculated against the whole binary data and encrypted by a key which
will be authenticated by one of the system's trusted certificates.
Any attempt to read and load a to-be-kexec-ed kernel image through
a system call will be checked and blocked if the binary's hash value
doesn't match its associated signature.

There are two methods available now:
1. implementing arch-specific verification hook of kexec_file_load()
2. utilizing IMA(Integrity Measurement Architecture)[4] appraisal framework

Before v7, I believed that my patches supported only (1), but I am now
confident that (2) comes for free if IMA is enabled and properly
configured.

(1) Arch-specific verification hook
If CONFIG_KEXEC_VERIFY_SIG is enabled, kexec_file_load() invokes an
arch-defined (and hence file-format-specific) hook function to check
for the validity of kernel binary.
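
To make "file-format-specific" a little more concrete, here is a minimal sketch of the kind of container check such a hook has to start from on arm64. It assumes the Image header layout described in Documentation/arm64/booting.txt (64-byte header, magic "ARM\x64" at byte offset 56) and that an Image built with CONFIG_EFI begins with the bytes "MZ". The function and macro names are hypothetical, and nothing here verifies a signature; it only recognizes the file format.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARM64_IMAGE_HEADER_SIZE_SKETCH	64	/* size of the fixed header */
#define ARM64_IMAGE_MAGIC_OFFSET_SKETCH	56	/* offset of "ARM\x64" */

/* True if the buffer starts with something shaped like an arm64 Image. */
static bool looks_like_arm64_image(const uint8_t *buf, size_t len)
{
	static const uint8_t magic[4] = { 'A', 'R', 'M', 0x64 };

	if (len < ARM64_IMAGE_HEADER_SIZE_SKETCH)
		return false;
	return memcmp(buf + ARM64_IMAGE_MAGIC_OFFSET_SKETCH, magic,
		      sizeof(magic)) == 0;
}

/* True if the Image additionally carries the "MZ" bytes of a PE file. */
static bool has_mz_magic(const uint8_t *buf, size_t len)
{
	return looks_like_arm64_image(buf, len) &&
	       buf[0] == 'M' && buf[1] == 'Z';
}
```

Only when both checks pass does it make sense to go on and parse the PE headers for an embedded signature.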
On x86, a signature is embedded into the header of a PE file (Microsoft's
format) binary. Since arm64's "Image" can also be seen as a PE file as
long as CONFIG_EFI is enabled, we adopt this format for kernel signing.

As in the case of UEFI applications, we can create a signed kernel image:
    $ sbsign --key ${KEY} --cert ${CERT} Image

You may want to use certs/signing_key.pem, which is intended to be used
for module signing (CONFIG_MODULE_SIG), as ${KEY} and ${CERT} for test
purposes.

(2) IMA appraisal-based
IMA was first introduced in Linux in order to meet the TCG (Trusted
Computing Group) requirement that all sensitive files be *measured*
before reading/executing them, to detect any untrusted
changes/modification. The appraisal feature, which allows us to ensure
the integrity of files and even prevent them from being read/executed,
was added later.

Meanwhile, kexec_file_load() has been merged since v3.17 and evolved to
enable IMA-appraisal type verification by the commit b804defe4297 ("kexec:
replace call to copy_file_from_fd() with kernel version").

In this scheme, a signature will be stored in an extended file attribute,
"security.ima", while a decryption key is held in a dedicated keyring,
".ima" or "_ima". All the necessary process of verification is confined
in a secure API, kernel_read_file_from_fd(), called by kexec_file_load().

Please note that powerpc is one of the two architectures now
supporting KEXEC_FILE, and that it wishes to extend IMA,
where a signature may be appended to "vmlinux" file[5], like module
signing, instead of using an extended file attribute.

While IMA is meant to be used with TPM (Trusted Platform Module) on
secure platforms, IMA is still usable without TPM. Here is an example
procedure for trying the feature using a self-signed root CA for
demo/test purposes:

1) Generate needed keys and certificates, following "Generate trusted
   keys" section in README of ima-evm-utils[6].

2) Build the kernel with the following kernel configurations, specifying
   "ima-local-ca.pem" for CONFIG_SYSTEM_TRUSTED_KEYS:
	CONFIG_EXT4_FS_SECURITY
	CONFIG_INTEGRITY_SIGNATURE
	CONFIG_INTEGRITY_ASYMMETRIC_KEYS
	CONFIG_INTEGRITY_TRUSTED_KEYRING
	CONFIG_IMA
	CONFIG_IMA_WRITE_POLICY
	CONFIG_IMA_READ_POLICY
	CONFIG_IMA_APPRAISE
	CONFIG_IMA_APPRAISE_BOOTPARAM
	CONFIG_SYSTEM_TRUSTED_KEYS
   Please note that CONFIG_KEXEC_VERIFY_SIG is not, actually should not be,
   enabled.

3) Sign(label) a kernel image binary to be kexec-ed on target filesystem:
   $ evmctl ima_sign --key /path/to/private_key.pem /your/Image

4) Add a command line parameter and boot the kernel:
   ima_appraise=enforce

On live system,
5) Set a security policy:
   $ mount -t securityfs none /sys/kernel/security
   $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" \
     > /sys/kernel/security/ima/policy

6) Add a key for ima:
   $ keyctl padd asymmetric