Re: Firmware signing -- Re: [PATCH 00/27] security, efi: Add kernel lockdown
On Thu, Nov 09, 2017 at 10:48:43AM +0900, AKASHI, Takahiro wrote: > On Wed, Nov 08, 2017 at 08:46:26PM +0100, Luis R. Rodriguez wrote: > > But perhaps I'm not understanding the issue well, let me know. > > My point is quite simple: > my_deviceA_init() { > err = request_firmware(, "deviceA"); <--- (a) > if (err) > goto err_request; > > err = verify_firmware(fw); <--- (b) > if (err) > goto err_verify; > > load_fw_to_deviceA(fw); <--- (c) > ... > } > > As legacy device drivers does not have (b), there is no chance to > prevent loading a firmware at (c) for locked-down kernel. Ah, I think your example requires another piece of code to make it clearer. Here is an example legacy driver: my_legacy_deviceB_init() { err = request_firmware(, "deviceB"); <--- (a) if (err) goto err_request; load_fw_to_deviceA(fw); <--- (c) ... } There is no verify_firmware() call here, and as such the approach Linus suggested a while ago cannot possibly fail on a "locked down kernel", unless *very* legacy API call gets a verify_firmware() sprinkled. One sensible thing to say here is then that all request_firmware() calls should just fail on a "locked down kernel", however if this were true then even calls which *did* issue a subsequent verify_firmware() would fail earlier therefore making verify_firmware() pointless on new drivers. Is that what you meant? A long time ago we discussed similar default mechanisms, so its worth re-iterating conclusions done before. The only difference with our current discussion is that the latest proposed mechanism by Linus for firmware signing was to consider a verify_firmware() so we can upkeep a functional API approach to adding support for firmware signing. What you are asking is what are default mechanism should be, and how do we *ensure* this works using the functional API approach. We had ironed out what the default mechanism should have been a while ago, not so much the later as the firmware API has been in a state of flux. I'll provide pointers to discussions around the default policy so that none of this is lost and so that how we end up doing things is well understood later. We can try to also hash out how to make this work then. David Woodhouse has long recommended that a 'defult key' should only be last resort [0], on that same email he also had suggested that since we could end up supporting different signing schemes it may be best to support cryptographic requirements (cert identifier or hash) from the start. James Bottomley has recommended that the default behaviour can only be left up to "system policy". A "system policy" is exactly what "kernel lockdown" is. David's email contained a man page which stipulated that the policy that this type of system would follow *is* to reject unsigned firmware. He and others may want to review if they indeed wanted to follow this path in lieu of this email. Even if one is not using a "kernel lockdown" system policy, we can consider the old schemes discussed for firmware singing. In 2015 I first proposed a simple signing scheme firmware signing which mimics the Linux module signing scheme [2]. The issues with "purpose of a key" was well hashed out (a firmware key should only work for firmware and only for the firmware it was designed for, etc) and later iterations considered this thanks to Andy. In this scheme, the default system policy depended on what kernel configuration option your system had been built with, and matching the module signing scheme we had: o CONFIG_FIRMWARE_SIG - "Firmware signature verification" o CONFIG_FIRMWARE_SIG_FORCE - "Require all firmware to be validly signed" Everything about signing was done automatically behind the schemes for us, just as with module signing. If CONFIG_FIRMWARE_SIG_FORCE was *not* enabled, we'd grant unsigned firmwares to be loaded, we'd just pr_warn() about them. Note that contrary to module signing, firmware signing could not taint for technical reasons I listed on my follow up patches after New Mexico Linux Plumbers [3]. Its worth re-iterating so its not lost. add_taint_module() is currently not exported, that can be fixed easily, but also, the old firmware API currently does not associate the caller module in the code, so we have a few options as to how to taint a kernel with unsigned firmware, which I listed but I'll list here again: 1) Extend the firmware_class driver API to require the correct module to always be set. 2) Use add_taint_module() with the firmware_class module for the two synchrounous APIs, while using the right module for the async call. 3) Use the firmware_class module for all 3 API calls when using add_taint_module() 4) Skip tainting the kernel for unsigned firmware files I went with 1) and 4) on that iteration of patches, with 1) only being done for newer API calls. After this
[PATCH 00/30] security, efi: Add kernel lockdown
Here's a set of patches to institute a "locked-down mode" in the kernel and to trigger that mode if the kernel is booted in secure-boot mode or through the command line. Enabling CONFIG_LOCK_DOWN_KERNEL makes lockdown mode available. Enabling CONFIG_ALLOW_LOCKDOWN_LIFT_BY_SYSRQ will allow a SysRq combination to lift the lockdown. On x86 this is SysRq+x. The keys must be pressed on an attached keyboard. Enabling CONFIG_LOCK_DOWN_IN_EFI_SECURE_BOOT will cause EFI secure boot to trigger kernel lockdown. Inside the kernel, kernel_is_locked_down() is used to check if the kernel is in lockdown mode. Note that the secure boot mode entry doesn't work if the kernel is booted from older versions of i386/x86_64 Grub as there's a bug in Grub whereby it doesn't initialise the boot_params correctly. The incorrect initialisation causes sanitize_boot_params() to be triggered, thereby zapping the secure boot flag determined by the EFI boot wrapper. A manual page, kernel_lockdown.7, is proposed, to which people will be directed by messages in dmesg. This lists the features that are restricted amongst other things. [Note: I need to update this to mention IMA, so I'll reply with that later]. Changes: (*) Made /dev/mem and /dev/kmem explicitly unopenable in lockdown mode, rather than being unopenable as a side effect of /dev/port being made unopenable. (*) Added lockdowns for ftrace and kprobes. (*) Made the bpf lockdown prohibit the use of sys_bpf entirely. (*) Made IMA require secure_boot rules in lockdown mode. (*) Made module signing and kexec allow unsigned images if IMA has been used to validate the image. The patches can be found here also: http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=efi-lock-down David --- Chun-Yi Lee (1): kexec_file: Restrict at runtime if the kernel is locked down Dave Young (1): Copy secure_boot flag in boot params across kexec reboot David Howells (14): Add the ability to lock down access to the running kernel image Enforce module signatures if the kernel is locked down scsi: Lock down the eata driver Prohibit PCMCIA CIS storage when the kernel is locked down Lock down TIOCSSERIAL Lock down module params that specify hardware parameters (eg. ioport) x86/mmiotrace: Lock down the testmmiotrace module debugfs: Disallow use of debugfs files when the kernel is locked down Lock down /proc/kcore Lock down ftrace Lock down kprobes bpf: Restrict kernel image access functions when the kernel is locked down efi: Add an EFI_SECURE_BOOT flag to indicate secure boot mode efi: Lock down the kernel if booted in secure boot mode Josh Boyer (2): hibernate: Disable when the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down Kyle McMartin (1): Add a SysRq option to lift kernel lockdown Linn Crosetto (2): acpi: Disable ACPI table override if the kernel is locked down acpi: Disable APEI error injection if the kernel is locked down Matthew Garrett (8): Restrict /dev/{mem,kmem,port} when the kernel is locked down kexec: Disable at runtime if the kernel is locked down uswsusp: Disable when the kernel is locked down PCI: Lock down BAR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down asus-wmi: Restrict debugfs interface when the kernel is locked down ACPI: Limit access to custom_method when the kernel is locked down Mimi Zohar (1): ima: require secure_boot rules in lockdown mode arch/x86/include/asm/setup.h|2 + arch/x86/kernel/ioport.c|6 +- arch/x86/kernel/kexec-bzimage64.c |1 arch/x86/kernel/msr.c | 10 +++ arch/x86/kernel/setup.c | 18 +- arch/x86/mm/testmmiotrace.c |3 + drivers/acpi/apei/einj.c|3 + drivers/acpi/custom_method.c|3 + drivers/acpi/osl.c |2 - drivers/acpi/tables.c |5 ++ drivers/char/mem.c |2 + drivers/firmware/efi/Makefile |1 drivers/firmware/efi/secureboot.c | 38 drivers/input/misc/uinput.c |1 drivers/pci/pci-sysfs.c |9 +++ drivers/pci/proc.c |9 +++ drivers/pci/syscall.c |3 + drivers/pcmcia/cistpl.c |3 + drivers/platform/x86/asus-wmi.c |9 +++ drivers/scsi/eata.c |5 +- drivers/tty/serial/serial_core.c|6 ++ drivers/tty/sysrq.c | 19 -- fs/debugfs/file.c |6 ++ fs/proc/kcore.c |2 + include/linux/efi.h | 16 +++-- include/linux/input.h |5 ++
[PATCH 01/30] Add the ability to lock down access to the running kernel image
Provide a single call to allow kernel code to determine whether the system should be locked down, thereby disallowing various accesses that might allow the running kernel image to be changed including the loading of modules that aren't validly signed with a key we recognise, fiddling with MSR registers and disallowing hibernation, Signed-off-by: David HowellsAcked-by: James Morris --- include/linux/kernel.h | 17 + include/linux/security.h |8 ++ security/Kconfig |8 ++ security/Makefile|3 ++ security/lock_down.c | 60 ++ 5 files changed, 96 insertions(+) create mode 100644 security/lock_down.c diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 0ad4c3044cf9..362da2e4bf53 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -287,6 +287,23 @@ static inline void refcount_error_report(struct pt_regs *regs, const char *err) { } #endif +#ifdef CONFIG_LOCK_DOWN_KERNEL +extern bool __kernel_is_locked_down(const char *what, bool first); +#else +static inline bool __kernel_is_locked_down(const char *what, bool first) +{ + return false; +} +#endif + +#define kernel_is_locked_down(what)\ + ({ \ + static bool message_given; \ + bool locked_down = __kernel_is_locked_down(what, !message_given); \ + message_given = true; \ + locked_down;\ + }) + /* Internal, do not use. */ int __must_check _kstrtoul(const char *s, unsigned int base, unsigned long *res); int __must_check _kstrtol(const char *s, unsigned int base, long *res); diff --git a/include/linux/security.h b/include/linux/security.h index ce6265960d6c..310775476b68 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -1753,5 +1753,13 @@ static inline void free_secdata(void *secdata) { } #endif /* CONFIG_SECURITY */ +#ifdef CONFIG_LOCK_DOWN_KERNEL +extern void __init init_lockdown(void); +#else +static inline void __init init_lockdown(void) +{ +} +#endif + #endif /* ! __LINUX_SECURITY_H */ diff --git a/security/Kconfig b/security/Kconfig index e8e449444e65..8e01fd59ae7e 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -205,6 +205,14 @@ config STATIC_USERMODEHELPER_PATH If you wish for all usermode helper programs to be disabled, specify an empty string here (i.e. ""). +config LOCK_DOWN_KERNEL + bool "Allow the kernel to be 'locked down'" + help + Allow the kernel to be locked down under certain circumstances, for + instance if UEFI secure boot is enabled. Locking down the kernel + turns off various features that might otherwise allow access to the + kernel image (eg. setting MSR registers). + source security/selinux/Kconfig source security/smack/Kconfig source security/tomoyo/Kconfig diff --git a/security/Makefile b/security/Makefile index f2d71cdb8e19..8c4a43e3d4e0 100644 --- a/security/Makefile +++ b/security/Makefile @@ -29,3 +29,6 @@ obj-$(CONFIG_CGROUP_DEVICE) += device_cgroup.o # Object integrity file lists subdir-$(CONFIG_INTEGRITY) += integrity obj-$(CONFIG_INTEGRITY)+= integrity/ + +# Allow the kernel to be locked down +obj-$(CONFIG_LOCK_DOWN_KERNEL) += lock_down.o diff --git a/security/lock_down.c b/security/lock_down.c new file mode 100644 index ..d8595c0e6673 --- /dev/null +++ b/security/lock_down.c @@ -0,0 +1,60 @@ +/* Lock down the kernel + * + * Copyright (C) 2016 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowe...@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. + */ + +#include +#include + +static __ro_after_init bool kernel_locked_down; + +/* + * Put the kernel into lock-down mode. + */ +static void __init lock_kernel_down(const char *where) +{ + if (!kernel_locked_down) { + kernel_locked_down = true; + pr_notice("Kernel is locked down from %s; see man kernel_lockdown.7\n", + where); + } +} + +static int __init lockdown_param(char *ignored) +{ + lock_kernel_down("command line"); + return 0; +} + +early_param("lockdown", lockdown_param); + +/* + * Lock the kernel down from very early in the arch setup. This must happen + * prior to things like ACPI being initialised. + */ +void __init init_lockdown(void) +{ +#ifdef CONFIG_LOCK_DOWN_IN_EFI_SECURE_BOOT + if (efi_enabled(EFI_SECURE_BOOT)) +
[PATCH 03/30] ima: require secure_boot rules in lockdown mode
From: Mimi ZoharRequire the "secure_boot" rules, whether or not it is specified on the boot command line, for both the builtin and custom policies in secure boot lockdown mode. Signed-off-by: Mimi Zohar Signed-off-by: David Howells --- security/integrity/ima/ima_policy.c | 39 ++- 1 file changed, 29 insertions(+), 10 deletions(-) diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c index 95209a5f8595..49c75e2a1ec5 100644 --- a/security/integrity/ima/ima_policy.c +++ b/security/integrity/ima/ima_policy.c @@ -427,14 +427,21 @@ void ima_update_policy_flag(void) */ void __init ima_init_policy(void) { - int i, measure_entries, appraise_entries, secure_boot_entries; + int i; + int measure_entries = 0; + int appraise_entries = 0; + int secure_boot_entries = 0; + bool kernel_locked_down = __kernel_is_locked_down(NULL, false); /* if !ima_policy set entries = 0 so we load NO default rules */ - measure_entries = ima_policy ? ARRAY_SIZE(dont_measure_rules) : 0; - appraise_entries = ima_use_appraise_tcb ? -ARRAY_SIZE(default_appraise_rules) : 0; - secure_boot_entries = ima_use_secure_boot ? - ARRAY_SIZE(secure_boot_rules) : 0; + if (ima_policy) + measure_entries = ARRAY_SIZE(dont_measure_rules); + + if (ima_use_appraise_tcb) + appraise_entries = ARRAY_SIZE(default_appraise_rules); + + if (ima_use_secure_boot || kernel_locked_down) + secure_boot_entries = ARRAY_SIZE(secure_boot_rules); for (i = 0; i < measure_entries; i++) list_add_tail(_measure_rules[i].list, _default_rules); @@ -455,11 +462,23 @@ void __init ima_init_policy(void) /* * Insert the appraise rules requiring file signatures, prior to -* any other appraise rules. +* any other appraise rules. In secure boot lock-down mode, also +* require these appraise rules for custom policies. */ - for (i = 0; i < secure_boot_entries; i++) - list_add_tail(_boot_rules[i].list, - _default_rules); + for (i = 0; i < secure_boot_entries; i++) { + struct ima_rule_entry *entry; + + /* Include for builtin policies */ + list_add_tail(_boot_rules[i].list, _default_rules); + + /* Include for custom policies */ + if (kernel_locked_down) { + entry = kmemdup(_boot_rules[i], sizeof(*entry), + GFP_KERNEL); + if (entry) + list_add_tail(>list, _policy_rules); + } + } for (i = 0; i < appraise_entries; i++) { list_add_tail(_appraise_rules[i].list, -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/30] kexec: Disable at runtime if the kernel is locked down
From: Matthew Garrettkexec permits the loading and execution of arbitrary code in ring 0, which is something that lock-down is meant to prevent. It makes sense to disable kexec in this situation. This does not affect kexec_file_load() which can check for a signature on the image to be booted. Signed-off-by: Matthew Garrett Signed-off-by: David Howells Acked-by: Dave Young Reviewed-by: "Lee, Chun-Yi" Reviewed-by: James Morris cc: ke...@lists.infradead.org --- kernel/kexec.c |7 +++ 1 file changed, 7 insertions(+) diff --git a/kernel/kexec.c b/kernel/kexec.c index e62ec4dc6620..7dadfed9b676 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -202,6 +202,13 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments, return -EPERM; /* +* kexec can be used to circumvent module loading restrictions, so +* prevent loading in that case +*/ + if (kernel_is_locked_down("kexec of unsigned images")) + return -EPERM; + + /* * Verify we have a legal set of flags * This leaves us room for future extensions. */ -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 07/30] Copy secure_boot flag in boot params across kexec reboot
From: Dave YoungKexec reboot in case secure boot being enabled does not keep the secure boot mode in new kernel, so later one can load unsigned kernel via legacy kexec_load. In this state, the system is missing the protections provided by secure boot. Adding a patch to fix this by retain the secure_boot flag in original kernel. secure_boot flag in boot_params is set in EFI stub, but kexec bypasses the stub. Fixing this issue by copying secure_boot flag across kexec reboot. Signed-off-by: Dave Young Signed-off-by: David Howells Reviewed-by: "Lee, Chun-Yi" cc: ke...@lists.infradead.org --- arch/x86/kernel/kexec-bzimage64.c |1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index fb095ba0c02f..7d0fac5bcbbe 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -179,6 +179,7 @@ setup_efi_state(struct boot_params *params, unsigned long params_load_addr, if (efi_enabled(EFI_OLD_MEMMAP)) return 0; + params->secure_boot = boot_params.secure_boot; ei->efi_loader_signature = current_ei->efi_loader_signature; ei->efi_systab = current_ei->efi_systab; ei->efi_systab_hi = current_ei->efi_systab_hi; -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/30] Restrict /dev/{mem, kmem, port} when the kernel is locked down
From: Matthew GarrettAllowing users to read and write to core kernel memory makes it possible for the kernel to be subverted, avoiding module loading restrictions, and also to steal cryptographic information. Disallow /dev/mem and /dev/kmem from being opened this when the kernel has been locked down to prevent this. Also disallow /dev/port from being opened to prevent raw ioport access and thus DMA from being used to accomplish the same thing. Signed-off-by: Matthew Garrett Signed-off-by: David Howells Reviewed-by: "Lee, Chun-Yi" --- drivers/char/mem.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/char/mem.c b/drivers/char/mem.c index 593a8818aca9..0ce5ac0a5c6b 100644 --- a/drivers/char/mem.c +++ b/drivers/char/mem.c @@ -762,6 +762,8 @@ static loff_t memory_lseek(struct file *file, loff_t offset, int orig) static int open_port(struct inode *inode, struct file *filp) { + if (kernel_is_locked_down("/dev/mem,kmem,port")) + return -EPERM; return capable(CAP_SYS_RAWIO) ? 0 : -EPERM; } -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/30] PCI: Lock down BAR access when the kernel is locked down
From: Matthew GarrettAny hardware that can potentially generate DMA has to be locked down in order to avoid it being possible for an attacker to modify kernel code, allowing them to circumvent disabled module loading or module signing. Default to paranoid - in future we can potentially relax this for sufficiently IOMMU-isolated devices. Signed-off-by: Matthew Garrett Signed-off-by: David Howells Acked-by: Bjorn Helgaas Reviewed-by: "Lee, Chun-Yi" cc: linux-...@vger.kernel.org --- drivers/pci/pci-sysfs.c |9 + drivers/pci/proc.c |9 - drivers/pci/syscall.c |3 ++- 3 files changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index 1eecfa301f7f..e1a3b0e765c2 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -881,6 +881,9 @@ static ssize_t pci_write_config(struct file *filp, struct kobject *kobj, loff_t init_off = off; u8 *data = (u8 *) buf; + if (kernel_is_locked_down("Direct PCI access")) + return -EPERM; + if (off > dev->cfg_size) return 0; if (off + count > dev->cfg_size) { @@ -1175,6 +1178,9 @@ static int pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr, enum pci_mmap_state mmap_type; struct resource *res = >resource[bar]; + if (kernel_is_locked_down("Direct PCI access")) + return -EPERM; + if (res->flags & IORESOURCE_MEM && iomem_is_exclusive(res->start)) return -EINVAL; @@ -1255,6 +1261,9 @@ static ssize_t pci_write_resource_io(struct file *filp, struct kobject *kobj, struct bin_attribute *attr, char *buf, loff_t off, size_t count) { + if (kernel_is_locked_down("Direct PCI access")) + return -EPERM; + return pci_resource_io(filp, kobj, attr, buf, off, count, true); } diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c index 098360d7ff81..a6c53d855daa 100644 --- a/drivers/pci/proc.c +++ b/drivers/pci/proc.c @@ -116,6 +116,9 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf, int size = dev->cfg_size; int cnt; + if (kernel_is_locked_down("Direct PCI access")) + return -EPERM; + if (pos >= size) return 0; if (nbytes >= size) @@ -195,6 +198,9 @@ static long proc_bus_pci_ioctl(struct file *file, unsigned int cmd, #endif /* HAVE_PCI_MMAP */ int ret = 0; + if (kernel_is_locked_down("Direct PCI access")) + return -EPERM; + switch (cmd) { case PCIIOC_CONTROLLER: ret = pci_domain_nr(dev->bus); @@ -236,7 +242,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma) struct pci_filp_private *fpriv = file->private_data; int i, ret, write_combine = 0, res_bit = IORESOURCE_MEM; - if (!capable(CAP_SYS_RAWIO)) + if (!capable(CAP_SYS_RAWIO) || + kernel_is_locked_down("Direct PCI access")) return -EPERM; if (fpriv->mmap_state == pci_mmap_io) { diff --git a/drivers/pci/syscall.c b/drivers/pci/syscall.c index 9bf993e1f71e..afa01cc3ceec 100644 --- a/drivers/pci/syscall.c +++ b/drivers/pci/syscall.c @@ -92,7 +92,8 @@ SYSCALL_DEFINE5(pciconfig_write, unsigned long, bus, unsigned long, dfn, u32 dword; int err = 0; - if (!capable(CAP_SYS_ADMIN)) + if (!capable(CAP_SYS_ADMIN) || + kernel_is_locked_down("Direct PCI access")) return -EPERM; dev = pci_get_bus_and_slot(bus, dfn); -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/30] kexec_file: Restrict at runtime if the kernel is locked down
From: Chun-Yi LeeWhen KEXEC_VERIFY_SIG is not enabled, kernel should not load images through kexec_file systemcall if the kernel is locked down unless IMA can be used to validate the image. This code was showed in Matthew's patch but not in git: https://lkml.org/lkml/2015/3/13/778 Cc: Matthew Garrett Signed-off-by: Chun-Yi Lee Signed-off-by: David Howells Reviewed-by: James Morris cc: ke...@lists.infradead.org --- kernel/kexec_file.c |8 1 file changed, 8 insertions(+) diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index 9f48f4412297..3ba28fc3fab0 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -255,6 +255,14 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd, if (!capable(CAP_SYS_BOOT) || kexec_load_disabled) return -EPERM; + /* Don't permit images to be loaded into trusted kernels if we're not +* going to verify the signature on them +*/ + if (!IS_ENABLED(CONFIG_KEXEC_VERIFY_SIG) && + !is_ima_appraise_enabled() && + kernel_is_locked_down("kexec of unsigned images")) + return -EPERM; + /* Make sure we have a legal set of flags */ if (flags != (flags & KEXEC_FILE_FLAGS)) return -EINVAL; -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/30] uswsusp: Disable when the kernel is locked down
From: Matthew Garrettuswsusp allows a user process to dump and then restore kernel state, which makes it possible to modify the running kernel. Disable this if the kernel is locked down. Signed-off-by: Matthew Garrett Signed-off-by: David Howells Reviewed-by: "Lee, Chun-Yi" Reviewed-by: James Morris cc: linux...@vger.kernel.org --- kernel/power/user.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/power/user.c b/kernel/power/user.c index 22df9f7ff672..678ade9decfe 100644 --- a/kernel/power/user.c +++ b/kernel/power/user.c @@ -52,6 +52,9 @@ static int snapshot_open(struct inode *inode, struct file *filp) if (!hibernation_available()) return -EPERM; + if (kernel_is_locked_down("/dev/snapshot")) + return -EPERM; + lock_system_sleep(); if (!atomic_add_unless(_device_available, -1, 0)) { -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 12/30] x86: Lock down IO port access when the kernel is locked down
From: Matthew GarrettIO port access would permit users to gain access to PCI configuration registers, which in turn (on a lot of hardware) give access to MMIO register space. This would potentially permit root to trigger arbitrary DMA, so lock it down by default. This also implicitly locks down the KDADDIO, KDDELIO, KDENABIO and KDDISABIO console ioctls. Signed-off-by: Matthew Garrett Signed-off-by: David Howells Reviewed-by: Thomas Gleixner Reviewed-by: "Lee, Chun-Yi" cc: x...@kernel.org --- arch/x86/kernel/ioport.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c index 9c3cf0944bce..2c0f058651c5 100644 --- a/arch/x86/kernel/ioport.c +++ b/arch/x86/kernel/ioport.c @@ -30,7 +30,8 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on) if ((from + num <= from) || (from + num > IO_BITMAP_BITS)) return -EINVAL; - if (turn_on && !capable(CAP_SYS_RAWIO)) + if (turn_on && (!capable(CAP_SYS_RAWIO) || + kernel_is_locked_down("ioperm"))) return -EPERM; /* @@ -120,7 +121,8 @@ SYSCALL_DEFINE1(iopl, unsigned int, level) return -EINVAL; /* Trying to gain more privileges? */ if (level > old) { - if (!capable(CAP_SYS_RAWIO)) + if (!capable(CAP_SYS_RAWIO) || + kernel_is_locked_down("iopl")) return -EPERM; } regs->flags = (regs->flags & ~X86_EFLAGS_IOPL) | -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 14/30] asus-wmi: Restrict debugfs interface when the kernel is locked down
From: Matthew GarrettWe have no way of validating what all of the Asus WMI methods do on a given machine - and there's a risk that some will allow hardware state to be manipulated in such a way that arbitrary code can be executed in the kernel, circumventing module loading restrictions. Prevent that if the kernel is locked down. Signed-off-by: Matthew Garrett Signed-off-by: David Howells Reviewed-by: "Lee, Chun-Yi" cc: acpi4asus-u...@lists.sourceforge.net cc: platform-driver-...@vger.kernel.org --- drivers/platform/x86/asus-wmi.c |9 + 1 file changed, 9 insertions(+) diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c index 48e1541dc8d4..ef5587469337 100644 --- a/drivers/platform/x86/asus-wmi.c +++ b/drivers/platform/x86/asus-wmi.c @@ -1905,6 +1905,9 @@ static int show_dsts(struct seq_file *m, void *data) int err; u32 retval = -1; + if (kernel_is_locked_down("Asus WMI")) + return -EPERM; + err = asus_wmi_get_devstate(asus, asus->debug.dev_id, ); if (err < 0) @@ -1921,6 +1924,9 @@ static int show_devs(struct seq_file *m, void *data) int err; u32 retval = -1; + if (kernel_is_locked_down("Asus WMI")) + return -EPERM; + err = asus_wmi_set_devstate(asus->debug.dev_id, asus->debug.ctrl_param, ); @@ -1945,6 +1951,9 @@ static int show_call(struct seq_file *m, void *data) union acpi_object *obj; acpi_status status; + if (kernel_is_locked_down("Asus WMI")) + return -EPERM; + status = wmi_evaluate_method(ASUS_WMI_MGMT_GUID, 0, asus->debug.method_id, , ); -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/30] x86/msr: Restrict MSR access when the kernel is locked down
From: Matthew GarrettWriting to MSRs should not be allowed if the kernel is locked down, since it could lead to execution of arbitrary code in kernel mode. Based on a patch by Kees Cook. MSR accesses are logged for the purposes of building up a whitelist as per Alan Cox's suggestion. Signed-off-by: Matthew Garrett Signed-off-by: David Howells Acked-by: Kees Cook Reviewed-by: Thomas Gleixner Reviewed-by: "Lee, Chun-Yi" cc: x...@kernel.org --- arch/x86/kernel/msr.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/arch/x86/kernel/msr.c b/arch/x86/kernel/msr.c index ef688804f80d..dfb61d358196 100644 --- a/arch/x86/kernel/msr.c +++ b/arch/x86/kernel/msr.c @@ -84,6 +84,11 @@ static ssize_t msr_write(struct file *file, const char __user *buf, int err = 0; ssize_t bytes = 0; + if (kernel_is_locked_down("Direct MSR access")) { + pr_info("Direct access to MSR %x\n", reg); + return -EPERM; + } + if (count % 8) return -EINVAL; /* Invalid chunk size */ @@ -135,6 +140,11 @@ static long msr_ioctl(struct file *file, unsigned int ioc, unsigned long arg) err = -EFAULT; break; } + if (kernel_is_locked_down("Direct MSR access")) { + pr_info("Direct access to MSR %x\n", regs[1]); /* Display %ecx */ + err = -EPERM; + break; + } err = wrmsr_safe_regs_on_cpu(cpu, regs); if (err) break; -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 15/30] ACPI: Limit access to custom_method when the kernel is locked down
From: Matthew Garrettcustom_method effectively allows arbitrary access to system memory, making it possible for an attacker to circumvent restrictions on module loading. Disable it if the kernel is locked down. Signed-off-by: Matthew Garrett Signed-off-by: David Howells Reviewed-by: "Lee, Chun-Yi" cc: linux-a...@vger.kernel.org --- drivers/acpi/custom_method.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/acpi/custom_method.c b/drivers/acpi/custom_method.c index c68e72414a67..b33fba70ec51 100644 --- a/drivers/acpi/custom_method.c +++ b/drivers/acpi/custom_method.c @@ -29,6 +29,9 @@ static ssize_t cm_write(struct file *file, const char __user * user_buf, struct acpi_table_header table; acpi_status status; + if (kernel_is_locked_down("ACPI custom methods")) + return -EPERM; + if (!(*ppos)) { /* parse the table header to get the table length */ if (count <= sizeof(struct acpi_table_header)) -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 17/30] acpi: Disable ACPI table override if the kernel is locked down
From: Linn Crosetto>From the kernel documentation (initrd_table_override.txt): If the ACPI_INITRD_TABLE_OVERRIDE compile option is true, it is possible to override nearly any ACPI table provided by the BIOS with an instrumented, modified one. When securelevel is set, the kernel should disallow any unauthenticated changes to kernel space. ACPI tables contain code invoked by the kernel, so do not allow ACPI tables to be overridden if the kernel is locked down. Signed-off-by: Linn Crosetto Signed-off-by: David Howells Reviewed-by: "Lee, Chun-Yi" cc: linux-a...@vger.kernel.org --- drivers/acpi/tables.c |5 + 1 file changed, 5 insertions(+) diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c index 80ce2a7d224b..5cc13c42daf9 100644 --- a/drivers/acpi/tables.c +++ b/drivers/acpi/tables.c @@ -526,6 +526,11 @@ void __init acpi_table_upgrade(void) if (table_nr == 0) return; + if (kernel_is_locked_down("ACPI table override")) { + pr_notice("kernel is locked down, ignoring table override\n"); + return; + } + acpi_tables_addr = memblock_find_in_range(0, ACPI_TABLE_UPGRADE_MAX_PHYS, all_tables_size, PAGE_SIZE); -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 19/30] scsi: Lock down the eata driver
When the kernel is running in secure boot mode, we lock down the kernel to prevent userspace from modifying the running kernel image. Whilst this includes prohibiting access to things like /dev/mem, it must also prevent access by means of configuring driver modules in such a way as to cause a device to access or modify the kernel image. The eata driver takes a single string parameter that contains a slew of settings, including hardware resource configuration. Prohibit use of the parameter if the kernel is locked down. Suggested-by: Alan CoxSigned-off-by: David Howells cc: Dario Ballabio cc: "James E.J. Bottomley" cc: "Martin K. Petersen" cc: linux-s...@vger.kernel.org --- drivers/scsi/eata.c |5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c index 6501c330d8c8..72fceaa8f3da 100644 --- a/drivers/scsi/eata.c +++ b/drivers/scsi/eata.c @@ -1552,8 +1552,11 @@ static int eata2x_detect(struct scsi_host_template *tpnt) tpnt->proc_name = "eata2x"; - if (strlen(boot_options)) + if (strlen(boot_options)) { + if (kernel_is_locked_down("Command line-specified device addresses, irqs and dma channels")) + return -EPERM; option_setup(boot_options); + } #if defined(MODULE) /* io_port could have been modified when loading as a module */ -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 16/30] acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down
From: Josh BoyerThis option allows userspace to pass the RSDP address to the kernel, which makes it possible for a user to modify the workings of hardware . Reject the option when the kernel is locked down. Signed-off-by: Josh Boyer Signed-off-by: David Howells Reviewed-by: "Lee, Chun-Yi" cc: Dave Young cc: linux-a...@vger.kernel.org --- drivers/acpi/osl.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c index db78d353bab1..36c6527c1b0a 100644 --- a/drivers/acpi/osl.c +++ b/drivers/acpi/osl.c @@ -192,7 +192,7 @@ acpi_physical_address __init acpi_os_get_root_pointer(void) acpi_physical_address pa = 0; #ifdef CONFIG_KEXEC - if (acpi_rsdp) + if (acpi_rsdp && !kernel_is_locked_down("ACPI RSDP specification")) return acpi_rsdp; #endif -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 18/30] acpi: Disable APEI error injection if the kernel is locked down
From: Linn CrosettoACPI provides an error injection mechanism, EINJ, for debugging and testing the ACPI Platform Error Interface (APEI) and other RAS features. If supported by the firmware, ACPI specification 5.0 and later provide for a way to specify a physical memory address to which to inject the error. Injecting errors through EINJ can produce errors which to the platform are indistinguishable from real hardware errors. This can have undesirable side-effects, such as causing the platform to mark hardware as needing replacement. While it does not provide a method to load unauthenticated privileged code, the effect of these errors may persist across reboots and affect trust in the underlying hardware, so disable error injection through EINJ if the kernel is locked down. Signed-off-by: Linn Crosetto Signed-off-by: David Howells Reviewed-by: "Lee, Chun-Yi" cc: linux-a...@vger.kernel.org --- drivers/acpi/apei/einj.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c index b38737c83a24..6d71e1e97b20 100644 --- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -518,6 +518,9 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, int rc; u64 base_addr, size; + if (kernel_is_locked_down("ACPI error injection")) + return -EPERM; + /* If user manually set "flags", make sure it is legal */ if (flags && (flags & ~(SETWA_FLAGS_APICID|SETWA_FLAGS_MEM|SETWA_FLAGS_PCIE_SBDF))) -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 20/30] Prohibit PCMCIA CIS storage when the kernel is locked down
Prohibit replacement of the PCMCIA Card Information Structure when the kernel is locked down. Suggested-by: Dominik BrodowskiSigned-off-by: David Howells cc: linux-pcm...@lists.infradead.org --- drivers/pcmcia/cistpl.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/pcmcia/cistpl.c b/drivers/pcmcia/cistpl.c index 55ef7d1fd8da..b7a0e42eeb25 100644 --- a/drivers/pcmcia/cistpl.c +++ b/drivers/pcmcia/cistpl.c @@ -1578,6 +1578,9 @@ static ssize_t pccard_store_cis(struct file *filp, struct kobject *kobj, struct pcmcia_socket *s; int error; + if (kernel_is_locked_down("Direct PCMCIA CIS storage")) + return -EPERM; + s = to_socket(container_of(kobj, struct device, kobj)); if (off) -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 21/30] Lock down TIOCSSERIAL
Lock down TIOCSSERIAL as that can be used to change the ioport and irq settings on a serial port. This only appears to be an issue for the serial drivers that use the core serial code. All other drivers seem to either ignore attempts to change port/irq or give an error. Reported-by: Greg Kroah-HartmanSigned-off-by: David Howells cc: Jiri Slaby --- drivers/tty/serial/serial_core.c |6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index 3a14cccbd7ff..41f0922ad842 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -842,6 +842,12 @@ static int uart_set_info(struct tty_struct *tty, struct tty_port *port, new_flags = (__force upf_t)new_info->flags; old_custom_divisor = uport->custom_divisor; + if ((change_port || change_irq) && + kernel_is_locked_down("Using TIOCSSERIAL to change device addresses, irqs and dma channels")) { + retval = -EPERM; + goto exit; + } + if (!capable(CAP_SYS_ADMIN)) { retval = -EPERM; if (change_irq || change_port || -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 23/30] x86/mmiotrace: Lock down the testmmiotrace module
The testmmiotrace module shouldn't be permitted when the kernel is locked down as it can be used to arbitrarily read and write MMIO space. Suggested-by: Thomas GleixnerSigned-off-by: David Howells cc: Steven Rostedt cc: Ingo Molnar cc: "H. Peter Anvin" cc: x...@kernel.org --- arch/x86/mm/testmmiotrace.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/mm/testmmiotrace.c b/arch/x86/mm/testmmiotrace.c index f6ae6830b341..bbaad357f5d7 100644 --- a/arch/x86/mm/testmmiotrace.c +++ b/arch/x86/mm/testmmiotrace.c @@ -115,6 +115,9 @@ static int __init init(void) { unsigned long size = (read_far) ? (8 << 20) : (16 << 10); + if (kernel_is_locked_down("MMIO trace testing")) + return -EPERM; + if (mmio_address == 0) { pr_err("you have to use the module argument mmio_address.\n"); pr_err("DO NOT LOAD THIS MODULE UNLESS YOU REALLY KNOW WHAT YOU ARE DOING!\n"); -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 24/30] debugfs: Disallow use of debugfs files when the kernel is locked down
Disallow opening of debugfs files when the kernel is locked down as various drivers give raw access to hardware through debugfs. Accesses to tracefs should use /sys/kernel/tracing/ rather than /sys/kernel/debug/tracing/. Possibly a symlink should be emplaced. Normal device interaction should be done through configfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. Signed-off-by: David Howellscc: Andy Shevchenko cc: acpi4asus-u...@lists.sourceforge.net cc: platform-driver-...@vger.kernel.org cc: Matthew Garrett cc: Thomas Gleixner --- fs/debugfs/file.c |6 ++ 1 file changed, 6 insertions(+) diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c index 6dabc4a10396..32b5168a7e91 100644 --- a/fs/debugfs/file.c +++ b/fs/debugfs/file.c @@ -103,6 +103,9 @@ static int open_proxy_open(struct inode *inode, struct file *filp) const struct file_operations *real_fops = NULL; int srcu_idx, r; + if (kernel_is_locked_down("debugfs")) + return -EPERM; + r = debugfs_use_file_start(dentry, _idx); if (r) { r = -ENOENT; @@ -232,6 +235,9 @@ static int full_proxy_open(struct inode *inode, struct file *filp) struct file_operations *proxy_fops = NULL; int srcu_idx, r; + if (kernel_is_locked_down("debugfs")) + return -EPERM; + r = debugfs_use_file_start(dentry, _idx); if (r) { r = -ENOENT; -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 22/30] Lock down module params that specify hardware parameters (eg. ioport)
Provided an annotation for module parameters that specify hardware parameters (such as io ports, iomem addresses, irqs, dma channels, fixed dma buffers and other types). Suggested-by: Alan CoxSigned-off-by: David Howells --- kernel/params.c | 26 +- 1 file changed, 21 insertions(+), 5 deletions(-) diff --git a/kernel/params.c b/kernel/params.c index 60b2d8101355..422979adb60a 100644 --- a/kernel/params.c +++ b/kernel/params.c @@ -108,13 +108,19 @@ bool parameq(const char *a, const char *b) return parameqn(a, b, strlen(a)+1); } -static void param_check_unsafe(const struct kernel_param *kp) +static bool param_check_unsafe(const struct kernel_param *kp, + const char *doing) { if (kp->flags & KERNEL_PARAM_FL_UNSAFE) { pr_warn("Setting dangerous option %s - tainting kernel\n", kp->name); add_taint(TAINT_USER, LOCKDEP_STILL_OK); } + + if (kp->flags & KERNEL_PARAM_FL_HWPARAM && + kernel_is_locked_down("Command line-specified device addresses, irqs and dma channels")) + return false; + return true; } static int parse_one(char *param, @@ -144,8 +150,10 @@ static int parse_one(char *param, pr_debug("handling %s with %p\n", param, params[i].ops->set); kernel_param_lock(params[i].mod); - param_check_unsafe([i]); - err = params[i].ops->set(val, [i]); + if (param_check_unsafe([i], doing)) + err = params[i].ops->set(val, [i]); + else + err = -EPERM; kernel_param_unlock(params[i].mod); return err; } @@ -556,6 +564,12 @@ static ssize_t param_attr_show(struct module_attribute *mattr, return count; } +#ifdef CONFIG_MODULES +#define mod_name(mod) (mod)->name +#else +#define mod_name(mod) "unknown" +#endif + /* sysfs always hands a nul-terminated string in buf. We rely on that. */ static ssize_t param_attr_store(struct module_attribute *mattr, struct module_kobject *mk, @@ -568,8 +582,10 @@ static ssize_t param_attr_store(struct module_attribute *mattr, return -EPERM; kernel_param_lock(mk->mod); - param_check_unsafe(attribute->param); - err = attribute->param->ops->set(buf, attribute->param); + if (param_check_unsafe(attribute->param, mod_name(mk->mod))) + err = attribute->param->ops->set(buf, attribute->param); + else + err = -EPERM; kernel_param_unlock(mk->mod); if (!err) return len; -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 25/30] Lock down /proc/kcore
Disallow access to /proc/kcore when the kernel is locked down to prevent access to cryptographic data. Signed-off-by: David HowellsReviewed-by: James Morris --- fs/proc/kcore.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c index 45629f4b5402..176cf749e650 100644 --- a/fs/proc/kcore.c +++ b/fs/proc/kcore.c @@ -549,6 +549,8 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos) static int open_kcore(struct inode *inode, struct file *filp) { + if (kernel_is_locked_down("/proc/kcore")) + return -EPERM; if (!capable(CAP_SYS_RAWIO)) return -EPERM; -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 26/30] Lock down ftrace
Disallow the use of ftrace when the kernel is locked down. This patch turns off ftrace_enabled late in the kernel boot so that the selftest can still be potentially be run. The sysctl that controls ftrace_enables is also disallowed when the kernel is locked down. If the lockdown is lifted, then the sysctl can be used to reenable ftrace - if ftrace was compiled with CONFIG_DYNAMIC_FTRACE, that is; if it wasn't then it won't be possible to reenable it. This prevents crypto data theft by analysis of execution patterns, and, if in future ftrace also logs the register contents at the time, will prevent data theft by that mechanism also. Reported-by: Alexei StarovoitovSigned-off-by: David Howells --- kernel/trace/ftrace.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 6abfafd7f173..9c7135963d80 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -6508,6 +6508,9 @@ ftrace_enable_sysctl(struct ctl_table *table, int write, { int ret = -ENODEV; + if (kernel_is_locked_down("Use of ftrace")) + return -EPERM; + mutex_lock(_lock); if (unlikely(ftrace_disabled)) @@ -6896,3 +6899,22 @@ void ftrace_graph_exit_task(struct task_struct *t) kfree(ret_stack); } #endif + +#ifdef CONFIG_LOCK_DOWN_KERNEL +static int __init ftrace_lock_down(void) +{ + mutex_lock(_lock); + + if (!ftrace_disabled && ftrace_enabled && + kernel_is_locked_down("Use of ftrace")) { + ftrace_enabled = false; + last_ftrace_enabled = false; + ftrace_trace_function = ftrace_stub; + ftrace_shutdown_sysctl(); + } + + mutex_unlock(_lock); + return 0; +} +late_initcall(ftrace_lock_down); +#endif -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 28/30] bpf: Restrict kernel image access functions when the kernel is locked down
There are some bpf functions can be used to read kernel memory: bpf_probe_read, bpf_probe_write_user and bpf_trace_printk. These allow private keys in kernel memory (e.g. the hibernation image signing key) to be read by an eBPF program and kernel memory to be altered without restriction. Completely prohibit the use of BPF when the kernel is locked down. Suggested-by: Alexei StarovoitovSigned-off-by: David Howells cc: net...@vger.kernel.org cc: Chun-Yi Lee cc: Alexei Starovoitov --- kernel/bpf/syscall.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 25d074920a00..fa58ad74cde6 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1458,6 +1458,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz if (!capable(CAP_SYS_ADMIN) && sysctl_unprivileged_bpf_disabled) return -EPERM; + if (kernel_is_locked_down("BPF")) + return -EPERM; + err = check_uarg_tail_zero(uattr, sizeof(attr), size); if (err) return err; -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 27/30] Lock down kprobes
Disallow the creation of kprobes when the kernel is locked down by preventing their registration. This prevents kprobes from being used to access kernel memory, either to make modifications or to steal crypto data. Reported-by: Alexei StarovoitovSigned-off-by: David Howells --- kernel/kprobes.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/kprobes.c b/kernel/kprobes.c index a1606a4224e1..f06023b0936c 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1530,6 +1530,9 @@ int register_kprobe(struct kprobe *p) struct module *probed_mod; kprobe_opcode_t *addr; + if (kernel_is_locked_down("Use of kprobes")) + return -EPERM; + /* Adjust probe address from symbol */ addr = kprobe_addr(p); if (IS_ERR(addr)) -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 29/30] efi: Add an EFI_SECURE_BOOT flag to indicate secure boot mode
UEFI machines can be booted in Secure Boot mode. Add an EFI_SECURE_BOOT flag that can be passed to efi_enabled() to find out whether secure boot is enabled. Move the switch-statement in x86's setup_arch() that inteprets the secure_boot boot parameter to generic code and set the bit there. Suggested-by: Ard BiesheuvelSigned-off-by: David Howells Reviewed-by: Ard Biesheuvel cc: linux-efi@vger.kernel.org --- arch/x86/kernel/setup.c | 14 +- drivers/firmware/efi/Makefile |1 + drivers/firmware/efi/secureboot.c | 38 + include/linux/efi.h | 16 ++-- 4 files changed, 50 insertions(+), 19 deletions(-) create mode 100644 drivers/firmware/efi/secureboot.c diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 0957dd73d127..7c2162f9e769 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -1197,19 +1197,7 @@ void __init setup_arch(char **cmdline_p) /* Allocate bigger log buffer */ setup_log_buf(1); - if (efi_enabled(EFI_BOOT)) { - switch (boot_params.secure_boot) { - case efi_secureboot_mode_disabled: - pr_info("Secure boot disabled\n"); - break; - case efi_secureboot_mode_enabled: - pr_info("Secure boot enabled\n"); - break; - default: - pr_info("Secure boot could not be determined\n"); - break; - } - } + efi_set_secure_boot(boot_params.secure_boot); reserve_initrd(); diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile index 0329d319d89a..883f9f7eefc6 100644 --- a/drivers/firmware/efi/Makefile +++ b/drivers/firmware/efi/Makefile @@ -23,6 +23,7 @@ obj-$(CONFIG_EFI_FAKE_MEMMAP) += fake_mem.o obj-$(CONFIG_EFI_BOOTLOADER_CONTROL) += efibc.o obj-$(CONFIG_EFI_TEST) += test/ obj-$(CONFIG_EFI_DEV_PATH_PARSER) += dev-path-parser.o +obj-$(CONFIG_EFI) += secureboot.o obj-$(CONFIG_APPLE_PROPERTIES) += apple-properties.o arm-obj-$(CONFIG_EFI) := arm-init.o arm-runtime.o diff --git a/drivers/firmware/efi/secureboot.c b/drivers/firmware/efi/secureboot.c new file mode 100644 index ..9070055de0a1 --- /dev/null +++ b/drivers/firmware/efi/secureboot.c @@ -0,0 +1,38 @@ +/* Core kernel secure boot support. + * + * Copyright (C) 2017 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowe...@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include + +/* + * Decide what to do when UEFI secure boot mode is enabled. + */ +void __init efi_set_secure_boot(enum efi_secureboot_mode mode) +{ + if (efi_enabled(EFI_BOOT)) { + switch (mode) { + case efi_secureboot_mode_disabled: + pr_info("Secure boot disabled\n"); + break; + case efi_secureboot_mode_enabled: + set_bit(EFI_SECURE_BOOT, ); + pr_info("Secure boot enabled\n"); + break; + default: + pr_warning("Secure boot could not be determined (mode %u)\n", + mode); + break; + } + } +} diff --git a/include/linux/efi.h b/include/linux/efi.h index 66f4a4e79f4b..7c7a7e33e4d1 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -1103,6 +1103,14 @@ extern int __init efi_setup_pcdp_console(char *); #define EFI_DBG8 /* Print additional debug info at runtime */ #define EFI_NX_PE_DATA 9 /* Can runtime data regions be mapped non-executable? */ #define EFI_MEM_ATTR 10 /* Did firmware publish an EFI_MEMORY_ATTRIBUTES table? */ +#define EFI_SECURE_BOOT11 /* Are we in Secure Boot mode? */ + +enum efi_secureboot_mode { + efi_secureboot_mode_unset, + efi_secureboot_mode_unknown, + efi_secureboot_mode_disabled, + efi_secureboot_mode_enabled, +}; #ifdef CONFIG_EFI /* @@ -1115,6 +1123,7 @@ static inline bool efi_enabled(int feature) extern void efi_reboot(enum reboot_mode reboot_mode, const char *__unused); extern bool efi_is_table_address(unsigned long phys_addr); +extern void __init efi_set_secure_boot(enum efi_secureboot_mode mode); #else static inline bool efi_enabled(int feature) { @@ -1133,6 +1142,7 @@ static inline bool
[PATCH 30/30] efi: Lock down the kernel if booted in secure boot mode
UEFI Secure Boot provides a mechanism for ensuring that the firmware will only load signed bootloaders and kernels. Certain use cases may also require that all kernel modules also be signed. Add a configuration option that to lock down the kernel - which includes requiring validly signed modules - if the kernel is secure-booted. Signed-off-by: David HowellsAcked-by: Ard Biesheuvel cc: linux-efi@vger.kernel.org --- arch/x86/kernel/setup.c |6 -- security/Kconfig| 14 ++ security/lock_down.c|1 + 3 files changed, 19 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 7c2162f9e769..4e38327efb2e 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -64,6 +64,7 @@ #include #include #include +#include #include #include @@ -1039,6 +1040,9 @@ void __init setup_arch(char **cmdline_p) if (efi_enabled(EFI_BOOT)) efi_init(); + efi_set_secure_boot(boot_params.secure_boot); + init_lockdown(); + dmi_scan_machine(); dmi_memdev_walk(); dmi_set_dump_stack_arch_desc(); @@ -1197,8 +1201,6 @@ void __init setup_arch(char **cmdline_p) /* Allocate bigger log buffer */ setup_log_buf(1); - efi_set_secure_boot(boot_params.secure_boot); - reserve_initrd(); acpi_table_upgrade(); diff --git a/security/Kconfig b/security/Kconfig index 1e997be94ba2..a4fa8b826039 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -222,6 +222,20 @@ config ALLOW_LOCKDOWN_LIFT_BY_SYSRQ Allow the lockdown on a kernel to be lifted, by pressing a SysRq key combination on a wired keyboard. +config LOCK_DOWN_IN_EFI_SECURE_BOOT + bool "Lock down the kernel in EFI Secure Boot mode" + default n + select LOCK_DOWN_KERNEL + depends on EFI + help + UEFI Secure Boot provides a mechanism for ensuring that the firmware + will only load signed bootloaders and kernels. Secure boot mode may + be determined from EFI variables provided by the system firmware if + not indicated by the boot parameters. + + Enabling this option turns on results in kernel lockdown being + triggered if EFI Secure Boot is set. + source security/selinux/Kconfig source security/smack/Kconfig diff --git a/security/lock_down.c b/security/lock_down.c index 2c6b00f0c229..527f7e51dc8d 100644 --- a/security/lock_down.c +++ b/security/lock_down.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #ifdef CONFIG_ALLOW_LOCKDOWN_LIFT_BY_SYSRQ -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/30] hibernate: Disable when the kernel is locked down
From: Josh BoyerThere is currently no way to verify the resume image when returning from hibernate. This might compromise the signed modules trust model, so until we can work with signed hibernate images we disable it when the kernel is locked down. Signed-off-by: Josh Boyer Signed-off-by: David Howells Reviewed-by: "Lee, Chun-Yi" cc: linux...@vger.kernel.org --- kernel/power/hibernate.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index a5c36e9c56a6..f2eafefeec50 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -70,7 +70,7 @@ static const struct platform_hibernation_ops *hibernation_ops; bool hibernation_available(void) { - return (nohibernate == 0); + return nohibernate == 0 && !kernel_is_locked_down("Hibernation"); } /** -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/30] Enforce module signatures if the kernel is locked down
If the kernel is locked down, require that all modules have valid signatures that we can verify or that IMA can validate the file. Signed-off-by: David HowellsReviewed-by: "Lee, Chun-Yi" Reviewed-by: James Morris --- kernel/module.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index de66ec825992..0ce29c8aa75a 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -64,6 +64,7 @@ #include #include #include +#include #include #include "module-internal.h" @@ -2757,7 +2758,8 @@ static inline void kmemleak_load_module(const struct module *mod, #endif #ifdef CONFIG_MODULE_SIG -static int module_sig_check(struct load_info *info, int flags) +static int module_sig_check(struct load_info *info, int flags, + bool can_do_ima_check) { int err = -ENOKEY; const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1; @@ -2781,13 +2783,16 @@ static int module_sig_check(struct load_info *info, int flags) } /* Not having a signature is only an error if we're strict. */ - if (err == -ENOKEY && !sig_enforce) + if (err == -ENOKEY && !sig_enforce && + (!can_do_ima_check || !is_ima_appraise_enabled()) && + !kernel_is_locked_down("Loading of unsigned modules")) err = 0; return err; } #else /* !CONFIG_MODULE_SIG */ -static int module_sig_check(struct load_info *info, int flags) +static int module_sig_check(struct load_info *info, int flags, + bool can_do_ima_check) { return 0; } @@ -3630,13 +3635,13 @@ static int unknown_module_param_cb(char *param, char *val, const char *modname, /* Allocate and load the module: note that size of section 0 is always zero, and we rely on this for optional sections. */ static int load_module(struct load_info *info, const char __user *uargs, - int flags) + int flags, bool can_do_ima_check) { struct module *mod; long err; char *after_dashes; - err = module_sig_check(info, flags); + err = module_sig_check(info, flags, can_do_ima_check); if (err) goto free_copy; @@ -3830,7 +3835,7 @@ SYSCALL_DEFINE3(init_module, void __user *, umod, if (err) return err; - return load_module(, uargs, 0); + return load_module(, uargs, 0, false); } SYSCALL_DEFINE3(finit_module, int, fd, const char __user *, uargs, int, flags) @@ -3857,7 +3862,7 @@ SYSCALL_DEFINE3(finit_module, int, fd, const char __user *, uargs, int, flags) info.hdr = hdr; info.len = size; - return load_module(, uargs, flags); + return load_module(, uargs, flags, true); } static inline int within(unsigned long addr, void *start, unsigned long size) -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/30] Add a SysRq option to lift kernel lockdown
From: Kyle McMartinMake an option to provide a sysrq key that will lift the kernel lockdown, thereby allowing the running kernel image to be accessed and modified. On x86 this is triggered with SysRq+x, but this key may not be available on all arches, so it is set by setting LOCKDOWN_LIFT_KEY in asm/setup.h. Since this macro must be defined in an arch to be able to use this facility for that arch, the Kconfig option is restricted to arches that support it. Signed-off-by: Kyle McMartin Signed-off-by: David Howells cc: x...@kernel.org --- arch/x86/include/asm/setup.h |2 ++ drivers/input/misc/uinput.c |1 + drivers/tty/sysrq.c | 19 - include/linux/input.h|5 include/linux/sysrq.h|8 ++- kernel/debug/kdb/kdb_main.c |2 +- security/Kconfig | 10 + security/lock_down.c | 47 ++ 8 files changed, 86 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h index a65cf544686a..863f77582c09 100644 --- a/arch/x86/include/asm/setup.h +++ b/arch/x86/include/asm/setup.h @@ -8,6 +8,8 @@ #include #include +#define LOCKDOWN_LIFT_KEY 'x' + #ifdef __i386__ #include diff --git a/drivers/input/misc/uinput.c b/drivers/input/misc/uinput.c index 443151de90c6..45a1f5460805 100644 --- a/drivers/input/misc/uinput.c +++ b/drivers/input/misc/uinput.c @@ -408,6 +408,7 @@ static int uinput_allocate_device(struct uinput_device *udev) if (!udev->dev) return -ENOMEM; + udev->dev->flags |= INPUTDEV_FLAGS_SYNTHETIC; udev->dev->event = uinput_dev_event; input_set_drvdata(udev->dev, udev); diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index 3ffc1ce29023..8b766dbad6dd 100644 --- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -481,6 +481,7 @@ static struct sysrq_key_op *sysrq_key_table[36] = { /* x: May be registered on mips for TLB dump */ /* x: May be registered on ppc/powerpc for xmon */ /* x: May be registered on sparc64 for global PMU dump */ + /* x: May be registered on x86_64 for disabling secure boot */ NULL, /* x */ /* y: May be registered on sparc64 for global register dump */ NULL, /* y */ @@ -524,7 +525,7 @@ static void __sysrq_put_key_op(int key, struct sysrq_key_op *op_p) sysrq_key_table[i] = op_p; } -void __handle_sysrq(int key, bool check_mask) +void __handle_sysrq(int key, unsigned int from) { struct sysrq_key_op *op_p; int orig_log_level; @@ -544,11 +545,15 @@ void __handle_sysrq(int key, bool check_mask) op_p = __sysrq_get_key_op(key); if (op_p) { + /* Ban synthetic events from some sysrq functionality */ + if ((from == SYSRQ_FROM_PROC || from == SYSRQ_FROM_SYNTHETIC) && + op_p->enable_mask & SYSRQ_DISABLE_USERSPACE) + printk("This sysrq operation is disabled from userspace.\n"); /* * Should we check for enabled operations (/proc/sysrq-trigger * should not) and is the invoked operation enabled? */ - if (!check_mask || sysrq_on_mask(op_p->enable_mask)) { + if (from == SYSRQ_FROM_KERNEL || sysrq_on_mask(op_p->enable_mask)) { pr_cont("%s\n", op_p->action_msg); console_loglevel = orig_log_level; op_p->handler(key); @@ -580,7 +585,7 @@ void __handle_sysrq(int key, bool check_mask) void handle_sysrq(int key) { if (sysrq_on()) - __handle_sysrq(key, true); + __handle_sysrq(key, SYSRQ_FROM_KERNEL); } EXPORT_SYMBOL(handle_sysrq); @@ -661,7 +666,7 @@ static void sysrq_do_reset(unsigned long _state) static void sysrq_handle_reset_request(struct sysrq_state *state) { if (state->reset_requested) - __handle_sysrq(sysrq_xlate[KEY_B], false); + __handle_sysrq(sysrq_xlate[KEY_B], SYSRQ_FROM_KERNEL); if (sysrq_reset_downtime_ms) mod_timer(>keyreset_timer, @@ -812,8 +817,10 @@ static bool sysrq_handle_keypress(struct sysrq_state *sysrq, default: if (sysrq->active && value && value != 2) { + int from = sysrq->handle.dev->flags & INPUTDEV_FLAGS_SYNTHETIC ? + SYSRQ_FROM_SYNTHETIC : 0; sysrq->need_reinject = false; - __handle_sysrq(sysrq_xlate[code], true); + __handle_sysrq(sysrq_xlate[code], from); } break; } @@ -1097,7 +1104,7 @@ static ssize_t write_sysrq_trigger(struct file *file, const char __user *buf, if
Re: [PATCH 18/27] bpf: Restrict kernel image access functions when the kernel is locked down
Alexei Starovoitovwrote: > > TBH, I've no idea how bpf does anything, so I can't say whether this is > > better, overkill or insufficient. > > ok. To make it clear: > Nacked-by: Alexei Starovoitov > For the current patch. > Unnecessary checks for no good reason in performance critical > functions are not acceptable. They aren't unnecessary checks. Can you please suggest if there's some way more suitable than just killing bpf entirely? I don't know the code, and I presume you do. David -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3/3] arm64: Add software workaround for Falkor erratum 1041
James, Looks like my VM test raised a false alarm. I retested stock Artful 4.13 kernel (No erratum 1041 patches applied). Host: Ubuntu Artful 4.13 kernel with *no* erratum 1041 patches applied. Guest: Ubuntu Zesty (4.10) kernel. - Created 20 VMs one at a time In a loop: - Stop (virsh destroy) 20 VMs one at a time - Start (virsh start) 20 VMs one at a time. And, I am able to reproduce the system reset issue I previously reported. I think the problem I reported with VMs might have nothing to do with the erratum 1041 patches, and probably needs to be root caused seperately. With stock 4.13 kernel (no erratum 1041 patches applied): awrep6 login: [ 461.881379] ACPI CPPC: PCC check channel failed. Status=0 [ 462.051194] ACPI CPPC: PCC check channel failed. Status=0 [ 462.223137] ACPI CPPC: PCC check channel failed. Status=0 [ 462.633790] ACPI CPPC: PCC check channel failed. Status=0 [ 463.231971] ACPI CPPC: PCC check channel failed. Status=0 [ 463.403163] ACPI CPPC: PCC check channel failed. Status=0 [ 463.822936] ACPI CPPC: PCC check channel failed. Status=0 [ 463.995222] ACPI CPPC: PCC check channel failed. Status=0 [ 464.130962] ACPI CPPC: PCC check channel failed. Status=0 [ 464.258973] ACPI CPPC: PCC check channel failed. Status=0 [ 465.283028] ACPI CPPC: PCC check channel failed. Status=0 SYS_DBG: Running SDI image (immediate mode) SYS_DBG: Ram Dump Init SYS_DBG: Failed to init SD card SYS_DBG: Resetting system! On Thu, 9 Nov 2017, Manoj Iyer wrote: On Thu, 9 Nov 2017, Manoj Iyer wrote: James, (sorry for top-posting) Applied patch 3 patches to Ubuntu Artful Kernel ( 4.13.0-16-generic ) - Start 20 VMs one at a time In a loop: - Stop (virsh destroy) 20 VMs one at a time - Start (virsh start) 20 VMs one at a time. Fixing some confusion I might have introduced in my prev email. - Applied all 3 patches to Ubuntu Artful Kernel ( 4.13.0-16-generic ) - Created 20 VMs one at a time In a loop: - Stop (virsh destroy) 20 VMs one at a time - Start (virsh start) 20 VMs one at a time. The system reset's itself after starting the last VM on the 1st loop displaying the following: awrep6 login: [ 603.349141] ACPI CPPC: PCC check channel failed. Status=0 [ 603.765101] ACPI CPPC: PCC check channel failed. Status=0 [ 603.937389] ACPI CPPC: PCC check channel failed. Status=0 [ 608.285495] ACPI CPPC: PCC check channel failed. Status=0 [ 608.289481] ACPI CPPC: PCC check channel failed. Status=0 SYS_DBG: Running SDI image (immediate mode) SYS_DBG: Ram Dump Init SYS_DBG: Failed to init SD card SYS_DBG: Resetting system! Followed by the following messages on system reboot: [ 6.616891] BERT: Error records from previous boot: [ 6.621655] [Hardware Error]: event severity: fatal [ 6.626516] [Hardware Error]: imprecise tstamp: -00-00 00:00:00 [ 6.632851] [Hardware Error]: Error 0, type: fatal [ 6.637713] [Hardware Error]: section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b [ 6.646045] [Hardware Error]: section length: 0x238 [ 6.651082] [Hardware Error]: : 72724502 5220726f 6f736165 6e55206e .Error Reason Un [ 6.659761] [Hardware Error]: 0010: 776f6e6b 006e known... [ 6.668442] [Hardware Error]: 0020: [ 6.677122] [Hardware Error]: 0030: On Thu, 9 Nov 2017, James Morse wrote: Hi Manoj, On 08/11/17 19:05, Manoj Iyer wrote: On Thu, 2 Nov 2017, Shanker Donthineni wrote: The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter and next 4KB. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. I applied the 3 patches to Ubuntu 4.13.0-16-generic (Artful) kernel and ran stress-ng cpu tests on QDF2400 server [...] Where stress-ng would spawn N workers and test cpu offline/online, perform matrix operations, do rapid context switchs, and anonymous mmaps. Although I was not able to reproduce the erratum on the stock 4.13 kernel using the same test case, the patched kernel did not seem to introduce any regressions either. I ran the stress-ng tests for over 8hrs found the system to be stable. Could you throw kexec and KVM into the mix? This issue only shows up when we disable the MMU, which we almost never do. For CPU offline/online we make the PSCI 'offline' call with the MMU enabled. When the CPU comes back firmware has reset the EL2/EL1 SCTLR from a higher
Re: [3/3] arm64: Add software workaround for Falkor erratum 1041
On Thu, 9 Nov 2017, Manoj Iyer wrote: James, (sorry for top-posting) Applied patch 3 patches to Ubuntu Artful Kernel ( 4.13.0-16-generic ) - Start 20 VMs one at a time In a loop: - Stop (virsh destroy) 20 VMs one at a time - Start (virsh start) 20 VMs one at a time. Fixing some confusion I might have introduced in my prev email. - Applied all 3 patches to Ubuntu Artful Kernel ( 4.13.0-16-generic ) - Created 20 VMs one at a time In a loop: - Stop (virsh destroy) 20 VMs one at a time - Start (virsh start) 20 VMs one at a time. The system reset's itself after starting the last VM on the 1st loop displaying the following: awrep6 login: [ 603.349141] ACPI CPPC: PCC check channel failed. Status=0 [ 603.765101] ACPI CPPC: PCC check channel failed. Status=0 [ 603.937389] ACPI CPPC: PCC check channel failed. Status=0 [ 608.285495] ACPI CPPC: PCC check channel failed. Status=0 [ 608.289481] ACPI CPPC: PCC check channel failed. Status=0 SYS_DBG: Running SDI image (immediate mode) SYS_DBG: Ram Dump Init SYS_DBG: Failed to init SD card SYS_DBG: Resetting system! Followed by the following messages on system reboot: [ 6.616891] BERT: Error records from previous boot: [ 6.621655] [Hardware Error]: event severity: fatal [ 6.626516] [Hardware Error]: imprecise tstamp: -00-00 00:00:00 [ 6.632851] [Hardware Error]: Error 0, type: fatal [ 6.637713] [Hardware Error]: section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b [ 6.646045] [Hardware Error]: section length: 0x238 [ 6.651082] [Hardware Error]: : 72724502 5220726f 6f736165 6e55206e .Error Reason Un [ 6.659761] [Hardware Error]: 0010: 776f6e6b 006e known... [ 6.668442] [Hardware Error]: 0020: [ 6.677122] [Hardware Error]: 0030: On Thu, 9 Nov 2017, James Morse wrote: Hi Manoj, On 08/11/17 19:05, Manoj Iyer wrote: On Thu, 2 Nov 2017, Shanker Donthineni wrote: The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter and next 4KB. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. I applied the 3 patches to Ubuntu 4.13.0-16-generic (Artful) kernel and ran stress-ng cpu tests on QDF2400 server [...] Where stress-ng would spawn N workers and test cpu offline/online, perform matrix operations, do rapid context switchs, and anonymous mmaps. Although I was not able to reproduce the erratum on the stock 4.13 kernel using the same test case, the patched kernel did not seem to introduce any regressions either. I ran the stress-ng tests for over 8hrs found the system to be stable. Could you throw kexec and KVM into the mix? This issue only shows up when we disable the MMU, which we almost never do. For CPU offline/online we make the PSCI 'offline' call with the MMU enabled. When the CPU comes back firmware has reset the EL2/EL1 SCTLR from a higher exception level, so it won't hit this issue. One place we do this is kexec, where we drop into purgatory with the MMU disabled. The other is KVM unloading itself to return to the hyp stub. You can stress this by starting and stopping a VM. When the number of VMs reaches 0 KVM should unload via 'kvm_arch_hardware_disable()'. Thanks, James -- Manoj Iyer Ubuntu/Canonical ARM Servers - Cloud -- Manoj Iyer Ubuntu/Canonical ARM Servers - Cloud -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3/3] arm64: Add software workaround for Falkor erratum 1041
James, (sorry for top-posting) Applied patch 3 patches to Ubuntu Artful Kernel ( 4.13.0-16-generic ) - Start 20 VMs one at a time In a loop: - Stop (virsh destroy) 20 VMs one at a time - Start (virsh start) 20 VMs one at a time. The system reset's itself after starting the last VM on the 1st loop displaying the following: awrep6 login: [ 603.349141] ACPI CPPC: PCC check channel failed. Status=0 [ 603.765101] ACPI CPPC: PCC check channel failed. Status=0 [ 603.937389] ACPI CPPC: PCC check channel failed. Status=0 [ 608.285495] ACPI CPPC: PCC check channel failed. Status=0 [ 608.289481] ACPI CPPC: PCC check channel failed. Status=0 SYS_DBG: Running SDI image (immediate mode) SYS_DBG: Ram Dump Init SYS_DBG: Failed to init SD card SYS_DBG: Resetting system! Followed by the following messages on system reboot: [ 6.616891] BERT: Error records from previous boot: [ 6.621655] [Hardware Error]: event severity: fatal [ 6.626516] [Hardware Error]: imprecise tstamp: -00-00 00:00:00 [ 6.632851] [Hardware Error]: Error 0, type: fatal [ 6.637713] [Hardware Error]: section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b [ 6.646045] [Hardware Error]: section length: 0x238 [ 6.651082] [Hardware Error]: : 72724502 5220726f 6f736165 6e55206e .Error Reason Un [ 6.659761] [Hardware Error]: 0010: 776f6e6b 006e known... [ 6.668442] [Hardware Error]: 0020: [ 6.677122] [Hardware Error]: 0030: On Thu, 9 Nov 2017, James Morse wrote: Hi Manoj, On 08/11/17 19:05, Manoj Iyer wrote: On Thu, 2 Nov 2017, Shanker Donthineni wrote: The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter and next 4KB. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. I applied the 3 patches to Ubuntu 4.13.0-16-generic (Artful) kernel and ran stress-ng cpu tests on QDF2400 server [...] Where stress-ng would spawn N workers and test cpu offline/online, perform matrix operations, do rapid context switchs, and anonymous mmaps. Although I was not able to reproduce the erratum on the stock 4.13 kernel using the same test case, the patched kernel did not seem to introduce any regressions either. I ran the stress-ng tests for over 8hrs found the system to be stable. Could you throw kexec and KVM into the mix? This issue only shows up when we disable the MMU, which we almost never do. For CPU offline/online we make the PSCI 'offline' call with the MMU enabled. When the CPU comes back firmware has reset the EL2/EL1 SCTLR from a higher exception level, so it won't hit this issue. One place we do this is kexec, where we drop into purgatory with the MMU disabled. The other is KVM unloading itself to return to the hyp stub. You can stress this by starting and stopping a VM. When the number of VMs reaches 0 KVM should unload via 'kvm_arch_hardware_disable()'. Thanks, James -- Manoj Iyer Ubuntu/Canonical ARM Servers - Cloud -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] arm64: Add software workaround for Falkor erratum 1041
Hi James, On 11/09/2017 05:08 AM, James Morse wrote: > Hi Shanker, Robin, > > On 04/11/17 21:43, Shanker Donthineni wrote: >> On 11/03/2017 10:11 AM, Robin Murphy wrote: >>> On 03/11/17 03:27, Shanker Donthineni wrote: The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter and next 4KB. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. 1) A System Error Interrupt (SEI) being raised by the Falkor core due to the errant memory access attempting to access a region of memory that is protected by a slave-side memory protection unit. 2) Unpredictable device behavior due to a speculative read from device memory. This behavior may only occur if the instruction cache is disabled prior to or coincident with translation being changed from enabled to disabled. To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. > > diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index b6dfb4f..4c91efb 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -30,6 +30,7 @@ #include #include #include +#include /* * Enable and disable interrupts. @@ -514,6 +515,22 @@ * reg: the value to be written. */ .macro write_sctlr, eln, reg +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041 +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 + tbnz\reg, #0, 8000f // enable MMU? > > Won't this match any change that leaves the MMU enabled? > Yes. No need to apply workaround if the MMU is going to be enabled. > I think the macro is making this more confusing. Disabling the MMU is obvious > from the call-site, (and really rare!). Trying to work it out from a macro > makes > it more complicated than necessary. > Not clear, are you suggesting not to use read{write}_sctlr() macros instead apply the workaround from the call-site based on the MMU-on status? If yes, It simplifies the code logic but CONFIG_QCOM_FALKOR_ERRATUM_1041 references are scatter everywhere. > >>> Do we really need the branch here? It's not like enabling the MMU is >>> something we do on the syscall fastpath, and I can't imagine an extra >>> ISB hurts much (and is probably comparable to a mispredicted branch > >> I don't have any strong opinion on whether to use an ISB conditionally >> or unconditionally. Yes, the current kernel code is not touching >> SCTLR_ELn register on the system call fast path. I would like to keep >> it as a conditional ISB in case if the future kernel accesses the >> SCTLR_ELn on the fast path. An extra ISB should not hurt a lot but I >> believe it has more overhead than the TBZ+branch mis-prediction on Falkor >> CPU. This patch has been tested on the real hardware to fix the problem. > >> I'm open to change to an unconditional ISB if it's the better fix. >> >>> anyway). In fact, is there any noticeable hit on other >>> microarchitectures if we save the alternative bother and just do it >>> unconditionally always? >>> >> >> I can't comment on the performance impacts of other CPUs since I don't >> have access to their development platforms. I'll prefer alternatives >> just to avoid the unnecessary overhead on future Qualcomm Datacenter >> server CPUs and regression on other CPUs because of inserting an ISB > > I think hiding errata on other CPUs is a good argument. > > My suggestion would be: >> #ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041 >> isb >> #endif > > In head.S and efi-entry.S, as these run before alternatives. > Then use alternatives to add just the isb in the mmu-off path for the other > callers. > > Thanks for your opinion on this one, I'll change to an unconditional ISB in v2 patch. After this change the enable_mmu() issues two ISBs before writing to SCTLR_EL1. Are you okay with this behavior? ENTRY(__enable_mmu) mrs x1, ID_AA64MMFR0_EL1 ubfxx2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4 cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED b.ne__no_granule_support update_early_cpu_boot_status 0, x1, x2 adrpx1, idmap_pg_dir adrpx2, swapper_pg_dir msr ttbr0_el1, x1 // load TTBR0
Re: [PATCH 3/3] arm64: Add software workaround for Falkor erratum 1041
Hi Shanker, Robin, On 04/11/17 21:43, Shanker Donthineni wrote: > On 11/03/2017 10:11 AM, Robin Murphy wrote: >> On 03/11/17 03:27, Shanker Donthineni wrote: >>> The ARM architecture defines the memory locations that are permitted >>> to be accessed as the result of a speculative instruction fetch from >>> an exception level for which all stages of translation are disabled. >>> Specifically, the core is permitted to speculatively fetch from the >>> 4KB region containing the current program counter and next 4KB. >>> >>> When translation is changed from enabled to disabled for the running >>> exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the >>> Falkor core may errantly speculatively access memory locations outside >>> of the 4KB region permitted by the architecture. The errant memory >>> access may lead to one of the following unexpected behaviors. >>> >>> 1) A System Error Interrupt (SEI) being raised by the Falkor core due >>>to the errant memory access attempting to access a region of memory >>>that is protected by a slave-side memory protection unit. >>> 2) Unpredictable device behavior due to a speculative read from device >>>memory. This behavior may only occur if the instruction cache is >>>disabled prior to or coincident with translation being changed from >>>enabled to disabled. >>> >>> To avoid the errant behavior, software must execute an ISB immediately >>> prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. >>> diff --git a/arch/arm64/include/asm/assembler.h >>> b/arch/arm64/include/asm/assembler.h >>> index b6dfb4f..4c91efb 100644 >>> --- a/arch/arm64/include/asm/assembler.h >>> +++ b/arch/arm64/include/asm/assembler.h >>> @@ -30,6 +30,7 @@ >>> #include >>> #include >>> #include >>> +#include >>> >>> /* >>> * Enable and disable interrupts. >>> @@ -514,6 +515,22 @@ >>> * reg: the value to be written. >>> */ >>> .macro write_sctlr, eln, reg >>> +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041 >>> +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 >>> + tbnz\reg, #0, 8000f // enable MMU? Won't this match any change that leaves the MMU enabled? I think the macro is making this more confusing. Disabling the MMU is obvious from the call-site, (and really rare!). Trying to work it out from a macro makes it more complicated than necessary. >> Do we really need the branch here? It's not like enabling the MMU is >> something we do on the syscall fastpath, and I can't imagine an extra >> ISB hurts much (and is probably comparable to a mispredicted branch > I don't have any strong opinion on whether to use an ISB conditionally > or unconditionally. Yes, the current kernel code is not touching > SCTLR_ELn register on the system call fast path. I would like to keep > it as a conditional ISB in case if the future kernel accesses the > SCTLR_ELn on the fast path. An extra ISB should not hurt a lot but I > believe it has more overhead than the TBZ+branch mis-prediction on Falkor > CPU. This patch has been tested on the real hardware to fix the problem. > I'm open to change to an unconditional ISB if it's the better fix. > >> anyway). In fact, is there any noticeable hit on other >> microarchitectures if we save the alternative bother and just do it >> unconditionally always? >> > > I can't comment on the performance impacts of other CPUs since I don't > have access to their development platforms. I'll prefer alternatives > just to avoid the unnecessary overhead on future Qualcomm Datacenter > server CPUs and regression on other CPUs because of inserting an ISB I think hiding errata on other CPUs is a good argument. My suggestion would be: > #ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041 > isb > #endif In head.S and efi-entry.S, as these run before alternatives. Then use alternatives to add just the isb in the mmu-off path for the other callers. > prior to SCTLR_ELn register update. Let's see what rest of the ARM > maintainers think about always using an ISB instead of alternatives. Thanks, James -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3/3] arm64: Add software workaround for Falkor erratum 1041
Hi Manoj, On 08/11/17 19:05, Manoj Iyer wrote: > On Thu, 2 Nov 2017, Shanker Donthineni wrote: >> The ARM architecture defines the memory locations that are permitted >> to be accessed as the result of a speculative instruction fetch from >> an exception level for which all stages of translation are disabled. >> Specifically, the core is permitted to speculatively fetch from the >> 4KB region containing the current program counter and next 4KB. >> >> When translation is changed from enabled to disabled for the running >> exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the >> Falkor core may errantly speculatively access memory locations outside >> of the 4KB region permitted by the architecture. The errant memory >> access may lead to one of the following unexpected behaviors. > I applied the 3 patches to Ubuntu 4.13.0-16-generic (Artful) kernel and > ran stress-ng cpu tests on QDF2400 server [...] > Where stress-ng would spawn N workers and test cpu offline/online, perform > matrix operations, do rapid context switchs, and anonymous mmaps. Although > I was not able to reproduce the erratum on the stock 4.13 kernel using the > same test case, the patched kernel did not seem to introduce any > regressions either. I ran the stress-ng tests for over 8hrs found the > system to be stable. Could you throw kexec and KVM into the mix? This issue only shows up when we disable the MMU, which we almost never do. For CPU offline/online we make the PSCI 'offline' call with the MMU enabled. When the CPU comes back firmware has reset the EL2/EL1 SCTLR from a higher exception level, so it won't hit this issue. One place we do this is kexec, where we drop into purgatory with the MMU disabled. The other is KVM unloading itself to return to the hyp stub. You can stress this by starting and stopping a VM. When the number of VMs reaches 0 KVM should unload via 'kvm_arch_hardware_disable()'. Thanks, James -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html