Re: [PATCH 2/2] efifb: Avoid reconfiguration of BAR that covers the framebuffer
On 19 May 2017 at 21:44, Bjorn Helgaas wrote: > On Fri, May 19, 2017 at 05:37:30PM +0100, Ard Biesheuvel wrote: >> Hi Bjorn, >> >> On 19 May 2017 at 17:27, Bjorn Helgaas wrote: >> > [+cc linux-pci] >> > >> > On Tue, Apr 04, 2017 at 04:27:44PM +0100, Ard Biesheuvel wrote: >> >> On UEFI systems, the PCI subsystem is enumerated by the firmware, >> >> and if a graphical framebuffer is exposed by a PCI device, its base >> >> address and size are exposed to the OS via the Graphics Output >> >> Protocol (GOP). >> >> >> >> On arm64 PCI systems, the entire PCI hierarchy is reconfigured from >> >> scratch at boot. This may result in the GOP framebuffer address to >> >> become stale, if the BAR covering the framebuffer is modified. This >> >> will cause the framebuffer to become unresponsive, and may in some >> >> cases result in unpredictable behavior if the range is reassigned to >> >> another device. >> >> >> >> So add a non-x86 quirk to the EFI fb driver to find the BAR associated >> >> with the GOP base address, and claim the BAR resource so that the PCI >> >> core will not move it. >> > >> > I know this has already been merged as 55d728a40d36 ("efi/fb: Avoid >> > reconfiguration of BAR that covers the framebuffer"), and I'm not >> > suggesting that we revert it, but I have some misgivings. >> ... > >> > Another is the use of pci_claim_resource() to express the idea that "this >> > BAR can not be moved". We have IORESOURCE_PCI_FIXED for that purpose, and >> > previous versions of the patch used that. I understand there was some >> > problem with that, but I wish we could figure out and fix that problem >> > instead of using a different mechanism. >> >> OK. The problem was that IORESOURCE_PCI_FIXED does not prevent the PCI >> subsystem from handing out the same range to another device. > > Yes, this would definitely be a problem. There must be a path where > we start doing the reassignment before we claim the resources that > have already been assigned. That's what seems backwards to me -- it > seems like we should claim things that are valid first so we know to > avoid them. > >> > I'm not even completely sold on the idea that we need to prevent the BAR >> > from being moved. I'm not a UEFI expert, but if this requirement is in the >> > spec, we should reference it. If not, it should be sufficient to remember >> > the boot-time BAR value, match the GOP base to *that*, and map it to >> > whatever the current BAR value is. >> >> There is no such requirement in the spec. The graphics output protocol >> is not described in terms of PCI, BARs etc. The framebuffer is simply >> a memory range with side effects that is left enabled when handing >> over to the OS. >> >> In summary, I am as unhappy with the patch as you are, but it is still >> an improvement over the previous situation, so let's simply >> collaborate to get this into shape going forward. >> >> My preference would be to investigate IORESOURCE_PCI_FIXED again, >> because that still seems like the best way to deal with a live >> framebuffer on a PCI device that has memory decoding enabled. It >> should also address the issue with the current version of the patch, >> i.e., that claiming resources at this point is not possible if the >> device resides behind a bridge. >> >> So is there any guidance you can give as to how to proceed? If we set >> IORESOURCE_PCI_FIXED, how should be prevent the PCI layer from >> assigning this resource elsewhere if we cannot claim it yet? > > My preference would actually be to remember the boot BAR values and > map to the current values because that avoids the unnecessary > restriction. The IORESOURCE_PCI_FIXED uses that seem legitimate to me > are the legacy VGA and IDE things (stuff that's not described by BARs > and *can't* be moved) and the new "Enhanced Allocation" stuff > (basically a way to describe more non-BAR resources). > > We do something sort of similar with pci_revert_fw_address(), although > it's currently only implemented for x86. I could imagine a more > generic solution that could be used for this GOP issue and could also > replace pci_revert_fw_address(). > I already proposed something like this a while ago: http://marc.info//?l=linux-fbdev&m=149190021316410&w=2 > Conceptually it could be as simple as adding 7 u64 boot-time BAR > values (6 BARs + the ROM) to struct pci_dev. We went to a lot of > trouble to implement the pcibios_fwaddrmap stuff on x86, and I can't > remember why it's x86-specific. Maybe we thought the extra 56 bytes > per dev was too much overhead (it does seem like a lot for such a > limited use case). Maybe there's a clever way to store just the BARs > we actually change and keep that long enough for efifb to use it. > > It *would* be nice to fix the problem with IORESOURCE_PCI_FIXED, and I > think it would help clean up PCI resource management a lot. Ideally > we would be able to claim host bridge resources even before scanning > the bus, then claim BARs immediately when we d
Re: [PATCH v5 28/32] x86/mm, kexec: Allow kexec to be used with SME
On 5/19/2017 4:28 PM, Borislav Petkov wrote: On Fri, May 19, 2017 at 04:07:24PM -0500, Tom Lendacky wrote: As long as those never change from static inline everything will be fine. I can change it, but I really like how it explicitly indicates I know what you want to do. But you're practically defining a helper which contains two arbitrary instructions which probably no one else will need. So how about we simplify this function even more. We don't need to pay attention to kexec being in progress because we're halting anyway so who cares how fast we halt. Might have to state that in the comment below though, instead of what's there now. And for the exact same moot reason, we don't need to look at SME CPUID feature - we can just as well WBINVD unconditionally. void stop_this_cpu(void *dummy) { local_irq_disable(); /* * Remove this CPU: */ set_cpu_online(smp_processor_id(), false); disable_local_APIC(); mcheck_cpu_clear(this_cpu_ptr(&cpu_info)); for (;;) { /* * If we are performing a kexec and the processor supports * SME then we need to clear out cache information before * halting. With kexec, going from SME inactive to SME active * requires clearing cache entries so that addresses without * the encryption bit set don't corrupt the same physical * address that has the encryption bit set when caches are * flushed. Perform a wbinvd followed by a halt to achieve * this. */ asm volatile("wbinvd; hlt" ::: "memory"); } } How's that? I can live with that! Thanks, Tom -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 28/32] x86/mm, kexec: Allow kexec to be used with SME
On Fri, May 19, 2017 at 04:07:24PM -0500, Tom Lendacky wrote: > As long as those never change from static inline everything will be > fine. I can change it, but I really like how it explicitly indicates I know what you want to do. But you're practically defining a helper which contains two arbitrary instructions which probably no one else will need. So how about we simplify this function even more. We don't need to pay attention to kexec being in progress because we're halting anyway so who cares how fast we halt. Might have to state that in the comment below though, instead of what's there now. And for the exact same moot reason, we don't need to look at SME CPUID feature - we can just as well WBINVD unconditionally. void stop_this_cpu(void *dummy) { local_irq_disable(); /* * Remove this CPU: */ set_cpu_online(smp_processor_id(), false); disable_local_APIC(); mcheck_cpu_clear(this_cpu_ptr(&cpu_info)); for (;;) { /* * If we are performing a kexec and the processor supports * SME then we need to clear out cache information before * halting. With kexec, going from SME inactive to SME active * requires clearing cache entries so that addresses without * the encryption bit set don't corrupt the same physical * address that has the encryption bit set when caches are * flushed. Perform a wbinvd followed by a halt to achieve * this. */ asm volatile("wbinvd; hlt" ::: "memory"); } } How's that? -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 28/32] x86/mm, kexec: Allow kexec to be used with SME
On 5/19/2017 3:58 PM, Borislav Petkov wrote: On Fri, May 19, 2017 at 03:45:28PM -0500, Tom Lendacky wrote: Actually there is. The above will result in data in the cache because halt() turns into a function call if CONFIG_PARAVIRT is defined (refer to the comment above where do_wbinvd_halt is set to true). I could make this a native_wbinvd() and native_halt() That's why we have the native_* versions - to bypass paravirt crap. As long as those never change from static inline everything will be fine. I can change it, but I really like how it explicitly indicates what is needed in this case. Even if the function gets changed from static inline the fact that the instructions are sequential in the function covers that case. Thanks, Tom -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 28/32] x86/mm, kexec: Allow kexec to be used with SME
On Fri, May 19, 2017 at 03:45:28PM -0500, Tom Lendacky wrote: > Actually there is. The above will result in data in the cache because > halt() turns into a function call if CONFIG_PARAVIRT is defined (refer > to the comment above where do_wbinvd_halt is set to true). I could make > this a native_wbinvd() and native_halt() That's why we have the native_* versions - to bypass paravirt crap. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 17/32] x86/mm: Add support to access boot related data in the clear
On 5/18/2017 4:02 AM, Borislav Petkov wrote: On Wed, May 17, 2017 at 01:54:39PM -0500, Tom Lendacky wrote: I was worried what the compiler might do when CONFIG_EFI is not set, but it appears to take care of it. I'll double check though. There's a efi_enabled() !CONFIG_EFI version too, so should be fine. I may introduce a length variable to capture data->len right after paddr_next is set and then have just a single memunmap() call before the if check. Yap. I tried that, but calling an "__init" function (early_memremap()) from a non "__init" function generated warnings. I suppose I can pass in a function for the map and unmap but that looks worse to me (also the unmap functions take different arguments). No, the other way around: the __init function should call the non-init one and you need the non-init one anyway for memremap_is_setup_data(). The "worker" function would be doing the loop through the setup data, but since the setup data is mapped inside the loop I can't do the __init calling the non-init function and still hope to consolidate the code. Maybe I'm missing something here... Thanks, Tom This is like the chicken and the egg scenario. In order to determine if an address is setup data I have to explicitly map the setup data chain as decrypted. In order to do that I have to supply a flag to explicitly map the data decrypted otherwise I wind up back in the memremap_is_setup_data() function again and again and again... Oh, fun. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 28/32] x86/mm, kexec: Allow kexec to be used with SME
On 5/17/2017 2:17 PM, Borislav Petkov wrote: On Tue, Apr 18, 2017 at 04:21:21PM -0500, Tom Lendacky wrote: Provide support so that kexec can be used to boot a kernel when SME is enabled. Support is needed to allocate pages for kexec without encryption. This is needed in order to be able to reboot in the kernel in the same manner as originally booted. Additionally, when shutting down all of the CPUs we need to be sure to flush the caches and then halt. This is needed when booting from a state where SME was not active into a state where SME is active (or vice-versa). Without these steps, it is possible for cache lines to exist for the same physical location but tagged both with and without the encryption bit. This can cause random memory corruption when caches are flushed depending on which cacheline is written last. Signed-off-by: Tom Lendacky --- arch/x86/include/asm/init.h |1 + arch/x86/include/asm/irqflags.h |5 + arch/x86/include/asm/kexec.h |8 arch/x86/include/asm/pgtable_types.h |1 + arch/x86/kernel/machine_kexec_64.c | 35 +- arch/x86/kernel/process.c| 26 +++-- arch/x86/mm/ident_map.c | 11 +++ include/linux/kexec.h| 14 ++ kernel/kexec_core.c |7 +++ 9 files changed, 101 insertions(+), 7 deletions(-) ... @@ -86,7 +86,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd) set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE)); } pte = pte_offset_kernel(pmd, vaddr); - set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC)); + set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC_NOENC)); return 0; err: free_transition_pgtable(image); @@ -114,6 +114,7 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable) .alloc_pgt_page = alloc_pgt_page, .context= image, .pmd_flag = __PAGE_KERNEL_LARGE_EXEC, + .kernpg_flag= _KERNPG_TABLE_NOENC, }; unsigned long mstart, mend; pgd_t *level4p; @@ -597,3 +598,35 @@ void arch_kexec_unprotect_crashkres(void) { kexec_mark_crashkres(false); } + +int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp) +{ + int ret; + + if (sme_active()) { if (!sme_active()) return 0; /* * If SME... Ok. + /* +* If SME is active we need to be sure that kexec pages are +* not encrypted because when we boot to the new kernel the +* pages won't be accessed encrypted (initially). +*/ + ret = set_memory_decrypted((unsigned long)vaddr, pages); + if (ret) + return ret; + + if (gfp & __GFP_ZERO) + memset(vaddr, 0, pages * PAGE_SIZE); This function is called after alloc_pages() which already zeroes memory when __GFP_ZERO is supplied. If you need to clear the memory *after* set_memory_encrypted() happens, then you should probably mask out __GFP_ZERO before the alloc_pages() call so as not to do it twice. I'll look into that. I could put the memset() at the end of this function so that it is done here no matter what. And update the default arch_kexec_post_alloc_pages() to also do the memset(). It just hides the clearing of the pages a bit though by doing that. + } + + return 0; +} + +void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) +{ + if (sme_active()) { + /* +* If SME is active we need to reset the pages back to being +* an encrypted mapping before freeing them. +*/ + set_memory_encrypted((unsigned long)vaddr, pages); + } +} diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 0bb8842..f4e5de6 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -355,8 +356,25 @@ bool xen_set_default_idle(void) return ret; } #endif + void stop_this_cpu(void *dummy) { + bool do_wbinvd_halt = false; + + if (kexec_in_progress && boot_cpu_has(X86_FEATURE_SME)) { + /* +* If we are performing a kexec and the processor supports +* SME then we need to clear out cache information before +* halting. With kexec, going from SME inactive to SME active +* requires clearing cache entries so that addresses without +* the encryption bit set don't corrupt the same physical +* address that has the encryption bit set when caches are +* flushed. Perform a wbinvd fol
Re: [PATCH 2/2] efifb: Avoid reconfiguration of BAR that covers the framebuffer
On Fri, May 19, 2017 at 05:37:30PM +0100, Ard Biesheuvel wrote: > Hi Bjorn, > > On 19 May 2017 at 17:27, Bjorn Helgaas wrote: > > [+cc linux-pci] > > > > On Tue, Apr 04, 2017 at 04:27:44PM +0100, Ard Biesheuvel wrote: > >> On UEFI systems, the PCI subsystem is enumerated by the firmware, > >> and if a graphical framebuffer is exposed by a PCI device, its base > >> address and size are exposed to the OS via the Graphics Output > >> Protocol (GOP). > >> > >> On arm64 PCI systems, the entire PCI hierarchy is reconfigured from > >> scratch at boot. This may result in the GOP framebuffer address to > >> become stale, if the BAR covering the framebuffer is modified. This > >> will cause the framebuffer to become unresponsive, and may in some > >> cases result in unpredictable behavior if the range is reassigned to > >> another device. > >> > >> So add a non-x86 quirk to the EFI fb driver to find the BAR associated > >> with the GOP base address, and claim the BAR resource so that the PCI > >> core will not move it. > > > > I know this has already been merged as 55d728a40d36 ("efi/fb: Avoid > > reconfiguration of BAR that covers the framebuffer"), and I'm not > > suggesting that we revert it, but I have some misgivings. > ... > > Another is the use of pci_claim_resource() to express the idea that "this > > BAR can not be moved". We have IORESOURCE_PCI_FIXED for that purpose, and > > previous versions of the patch used that. I understand there was some > > problem with that, but I wish we could figure out and fix that problem > > instead of using a different mechanism. > > OK. The problem was that IORESOURCE_PCI_FIXED does not prevent the PCI > subsystem from handing out the same range to another device. Yes, this would definitely be a problem. There must be a path where we start doing the reassignment before we claim the resources that have already been assigned. That's what seems backwards to me -- it seems like we should claim things that are valid first so we know to avoid them. > > I'm not even completely sold on the idea that we need to prevent the BAR > > from being moved. I'm not a UEFI expert, but if this requirement is in the > > spec, we should reference it. If not, it should be sufficient to remember > > the boot-time BAR value, match the GOP base to *that*, and map it to > > whatever the current BAR value is. > > There is no such requirement in the spec. The graphics output protocol > is not described in terms of PCI, BARs etc. The framebuffer is simply > a memory range with side effects that is left enabled when handing > over to the OS. > > In summary, I am as unhappy with the patch as you are, but it is still > an improvement over the previous situation, so let's simply > collaborate to get this into shape going forward. > > My preference would be to investigate IORESOURCE_PCI_FIXED again, > because that still seems like the best way to deal with a live > framebuffer on a PCI device that has memory decoding enabled. It > should also address the issue with the current version of the patch, > i.e., that claiming resources at this point is not possible if the > device resides behind a bridge. > > So is there any guidance you can give as to how to proceed? If we set > IORESOURCE_PCI_FIXED, how should be prevent the PCI layer from > assigning this resource elsewhere if we cannot claim it yet? My preference would actually be to remember the boot BAR values and map to the current values because that avoids the unnecessary restriction. The IORESOURCE_PCI_FIXED uses that seem legitimate to me are the legacy VGA and IDE things (stuff that's not described by BARs and *can't* be moved) and the new "Enhanced Allocation" stuff (basically a way to describe more non-BAR resources). We do something sort of similar with pci_revert_fw_address(), although it's currently only implemented for x86. I could imagine a more generic solution that could be used for this GOP issue and could also replace pci_revert_fw_address(). Conceptually it could be as simple as adding 7 u64 boot-time BAR values (6 BARs + the ROM) to struct pci_dev. We went to a lot of trouble to implement the pcibios_fwaddrmap stuff on x86, and I can't remember why it's x86-specific. Maybe we thought the extra 56 bytes per dev was too much overhead (it does seem like a lot for such a limited use case). Maybe there's a clever way to store just the BARs we actually change and keep that long enough for efifb to use it. It *would* be nice to fix the problem with IORESOURCE_PCI_FIXED, and I think it would help clean up PCI resource management a lot. Ideally we would be able to claim host bridge resources even before scanning the bus, then claim BARs immediately when we discover them. That would allow us to automatically use any setup done by firmware, and reassign anything that we couldn't claim. But I think this will end up being difficult because we currently do the PCI bus scan before looking at ACPI resources, and sometimes those
[PATCH V17 04/11] efi: parse ARM processor error
Add support for ARM Common Platform Error Record (CPER). UEFI 2.6 specification adds support for ARM specific processor error information to be reported as part of the CPER records. This provides more detail on for processor error logs. Signed-off-by: Tyler Baicar CC: Jonathan (Zhixiong) Zhang Reviewed-by: James Morse Reviewed-by: Ard Biesheuvel --- drivers/firmware/efi/cper.c | 129 include/linux/cper.h| 54 +++ 2 files changed, 183 insertions(+) diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index 229cf92..eac0854 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -110,12 +110,15 @@ void cper_print_bits(const char *pfx, unsigned int bits, static const char * const proc_type_strs[] = { "IA32/X64", "IA64", + "ARM", }; static const char * const proc_isa_strs[] = { "IA32", "IA64", "X64", + "ARM A32/T32", + "ARM A64", }; static const char * const proc_error_type_strs[] = { @@ -184,6 +187,122 @@ static void cper_print_proc_generic(const char *pfx, printk("%s""IP: 0x%016llx\n", pfx, proc->ip); } +#if defined(CONFIG_ARM64) || defined(CONFIG_ARM) +static const char * const arm_reg_ctx_strs[] = { + "AArch32 general purpose registers", + "AArch32 EL1 context registers", + "AArch32 EL2 context registers", + "AArch32 secure context registers", + "AArch64 general purpose registers", + "AArch64 EL1 context registers", + "AArch64 EL2 context registers", + "AArch64 EL3 context registers", + "Misc. system register structure", +}; + +static void cper_print_proc_arm(const char *pfx, + const struct cper_sec_proc_arm *proc) +{ + int i, len, max_ctx_type; + struct cper_arm_err_info *err_info; + struct cper_arm_ctx_info *ctx_info; + char newpfx[64]; + + printk("%sMIDR: 0x%016llx\n", pfx, proc->midr); + + len = proc->section_length - (sizeof(*proc) + + proc->err_info_num * (sizeof(*err_info))); + if (len < 0) { + printk("%ssection length: %d\n", pfx, proc->section_length); + printk("%ssection length is too small\n", pfx); + printk("%sfirmware-generated error record is incorrect\n", pfx); + printk("%sERR_INFO_NUM is %d\n", pfx, proc->err_info_num); + return; + } + + if (proc->validation_bits & CPER_ARM_VALID_MPIDR) + printk("%sMultiprocessor Affinity Register (MPIDR): 0x%016llx\n", + pfx, proc->mpidr); + + if (proc->validation_bits & CPER_ARM_VALID_AFFINITY_LEVEL) + printk("%serror affinity level: %d\n", pfx, + proc->affinity_level); + + if (proc->validation_bits & CPER_ARM_VALID_RUNNING_STATE) { + printk("%srunning state: 0x%x\n", pfx, proc->running_state); + printk("%sPower State Coordination Interface state: %d\n", + pfx, proc->psci_state); + } + + snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP); + + err_info = (struct cper_arm_err_info *)(proc + 1); + for (i = 0; i < proc->err_info_num; i++) { + printk("%sError info structure %d:\n", pfx, i); + + printk("%snum errors: %d\n", pfx, err_info->multiple_error + 1); + + if (err_info->validation_bits & CPER_ARM_INFO_VALID_FLAGS) { + if (err_info->flags & CPER_ARM_INFO_FLAGS_FIRST) + printk("%sfirst error captured\n", newpfx); + if (err_info->flags & CPER_ARM_INFO_FLAGS_LAST) + printk("%slast error captured\n", newpfx); + if (err_info->flags & CPER_ARM_INFO_FLAGS_PROPAGATED) + printk("%spropagated error captured\n", + newpfx); + if (err_info->flags & CPER_ARM_INFO_FLAGS_OVERFLOW) + printk("%soverflow occurred, error info is incomplete\n", + newpfx); + } + + printk("%serror_type: %d, %s\n", newpfx, err_info->type, + err_info->type < ARRAY_SIZE(proc_error_type_strs) ? + proc_error_type_strs[err_info->type] : "unknown"); + if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO) + printk("%serror_info: 0x%016llx\n", newpfx, + err_info->error_info); + if (err_info->validation_bits & CPER_ARM_INFO_VALID_VIRT_ADDR) + printk("%svirtual fault address: 0x%016llx\n", + newpfx, err_info->virt_fault_addr); + if (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSIC
[PATCH V17 02/11] ras: acpi/apei: cper: add support for generic data v3 structure
The ACPI 6.1 spec adds a new revision of the generic error data entry structure. Add support to handle the new structure as well as properly verify and iterate through the generic data entries. Signed-off-by: Tyler Baicar CC: Jonathan (Zhixiong) Zhang --- drivers/acpi/apei/ghes.c| 11 +-- drivers/firmware/efi/cper.c | 37 ++--- include/acpi/ghes.h | 36 3 files changed, 63 insertions(+), 21 deletions(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index c072acf..626552e 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -428,8 +428,7 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int unsigned long pfn; int flags = -1; int sec_sev = ghes_severity(gdata->error_severity); - struct cper_sec_mem_err *mem_err; - mem_err = (struct cper_sec_mem_err *)(gdata + 1); + struct cper_sec_mem_err *mem_err = acpi_hest_get_payload(gdata); if (!(mem_err->validation_bits & CPER_MEM_VALID_PA)) return; @@ -465,8 +464,8 @@ static void ghes_do_proc(struct ghes *ghes, sec_sev = ghes_severity(gdata->error_severity); if (!uuid_le_cmp(*(uuid_le *)gdata->section_type, CPER_SEC_PLATFORM_MEM)) { - struct cper_sec_mem_err *mem_err; - mem_err = (struct cper_sec_mem_err *)(gdata+1); + struct cper_sec_mem_err *mem_err = acpi_hest_get_payload(gdata); + ghes_edac_report_mem_error(ghes, sev, mem_err); arch_apei_report_mem_error(sev, mem_err); @@ -475,8 +474,8 @@ static void ghes_do_proc(struct ghes *ghes, #ifdef CONFIG_ACPI_APEI_PCIEAER else if (!uuid_le_cmp(*(uuid_le *)gdata->section_type, CPER_SEC_PCIE)) { - struct cper_sec_pcie *pcie_err; - pcie_err = (struct cper_sec_pcie *)(gdata+1); + struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata); + if (sev == GHES_SEV_RECOVERABLE && sec_sev == GHES_SEV_RECOVERABLE && pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID && diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index d425374..9024757 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -32,6 +32,7 @@ #include #include #include +#include #define INDENT_SP " " @@ -386,8 +387,9 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie, pfx, pcie->bridge.secondary_status, pcie->bridge.control); } -static void cper_estatus_print_section( - const char *pfx, const struct acpi_hest_generic_data *gdata, int sec_no) +static void +cper_estatus_print_section(const char *pfx, struct acpi_hest_generic_data *gdata, + int sec_no) { uuid_le *sec_type = (uuid_le *)gdata->section_type; __u16 severity; @@ -403,14 +405,16 @@ static void cper_estatus_print_section( snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP); if (!uuid_le_cmp(*sec_type, CPER_SEC_PROC_GENERIC)) { - struct cper_sec_proc_generic *proc_err = (void *)(gdata + 1); + struct cper_sec_proc_generic *proc_err = acpi_hest_get_payload(gdata); + printk("%s""section_type: general processor error\n", newpfx); if (gdata->error_data_length >= sizeof(*proc_err)) cper_print_proc_generic(newpfx, proc_err); else goto err_section_too_small; } else if (!uuid_le_cmp(*sec_type, CPER_SEC_PLATFORM_MEM)) { - struct cper_sec_mem_err *mem_err = (void *)(gdata + 1); + struct cper_sec_mem_err *mem_err = acpi_hest_get_payload(gdata); + printk("%s""section_type: memory error\n", newpfx); if (gdata->error_data_length >= sizeof(struct cper_sec_mem_err_old)) @@ -419,7 +423,8 @@ static void cper_estatus_print_section( else goto err_section_too_small; } else if (!uuid_le_cmp(*sec_type, CPER_SEC_PCIE)) { - struct cper_sec_pcie *pcie = (void *)(gdata + 1); + struct cper_sec_pcie *pcie = acpi_hest_get_payload(gdata); + printk("%s""section_type: PCIe error\n", newpfx); if (gdata->error_data_length >= sizeof(*pcie)) cper_print_pcie(newpfx, pcie, gdata); @@ -438,7 +443,7 @@ void cper_estatus_print(const char *pfx, const struct acpi_hest_generic_status *estatus) { struct acpi_hest_generic_data *gdata; - unsigned int data_len, gedata_len; + unsigned int data_len
[PATCH V17 06/11] acpi: apei: handle SEA notification type for ARMv8
ARM APEI extension proposal added SEA (Synchronous External Abort) notification type for ARMv8. Add a new GHES error source handling function for SEA. If an error source's notification type is SEA, then this function can be registered into the SEA exception handler. That way GHES will parse and report SEA exceptions when they occur. An SEA can interrupt code that had interrupts masked and is treated as an NMI. To aid this the page of address space for mapping APEI buffers while in_nmi() is always reserved, and ghes_ioremap_pfn_nmi() is changed to use the helper methods to find the prot_t to map with in the same way as ghes_ioremap_pfn_irq(). Signed-off-by: Tyler Baicar CC: Jonathan (Zhixiong) Zhang Reviewed-by: James Morse Acked-by: Catalin Marinas --- arch/arm64/Kconfig| 2 ++ arch/arm64/mm/fault.c | 17 drivers/acpi/apei/Kconfig | 15 ++ drivers/acpi/apei/ghes.c | 70 +++ include/acpi/ghes.h | 7 + 5 files changed, 105 insertions(+), 6 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 3dcd7ec..8055997 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -19,6 +19,7 @@ config ARM64 select ARCH_HAS_STRICT_KERNEL_RWX select ARCH_HAS_STRICT_MODULE_RWX select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST + select ARCH_HAVE_NMI_SAFE_CMPXCHG if ACPI_APEI_SEA select ARCH_USE_CMPXCHG_LOCKREF select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_NUMA_BALANCING @@ -92,6 +93,7 @@ config ARM64 select HAVE_IRQ_TIME_ACCOUNTING select HAVE_MEMBLOCK select HAVE_MEMBLOCK_NODE_MAP if NUMA + select HAVE_NMI if ACPI_APEI_SEA select HAVE_PATA_PLATFORM select HAVE_PERF_EVENTS select HAVE_PERF_REGS diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 6697816..9b4e26f 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -42,6 +42,8 @@ #include #include +#include + struct fault_info { int (*fn)(unsigned long addr, unsigned int esr, struct pt_regs *regs); @@ -535,6 +537,21 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs) pr_err("Synchronous External Abort: %s (0x%08x) at 0x%016lx\n", inf->name, esr, addr); + /* +* Synchronous aborts may interrupt code which had interrupts masked. +* Before calling out into the wider kernel tell the interested +* subsystems. +*/ + if (IS_ENABLED(CONFIG_ACPI_APEI_SEA)) { + if (interrupts_enabled(regs)) + nmi_enter(); + + ghes_notify_sea(); + + if (interrupts_enabled(regs)) + nmi_exit(); + } + info.si_signo = SIGBUS; info.si_errno = 0; info.si_code = 0; diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig index b0140c8..de14d49 100644 --- a/drivers/acpi/apei/Kconfig +++ b/drivers/acpi/apei/Kconfig @@ -39,6 +39,21 @@ config ACPI_APEI_PCIEAER PCIe AER errors may be reported via APEI firmware first mode. Turn on this option to enable the corresponding support. +config ACPI_APEI_SEA + bool "APEI Synchronous External Abort logging/recovering support" + depends on ARM64 && ACPI_APEI_GHES + default y + help + This option should be enabled if the system supports + firmware first handling of SEA (Synchronous External Abort). + SEA happens with certain faults of data abort or instruction + abort synchronous exceptions on ARMv8 systems. If a system + supports firmware first handling of SEA, the platform analyzes + and handles hardware error notifications from SEA, and it may then + form a HW error record for the OS to parse and handle. This + option allows the OS to look for such hardware error record, and + take appropriate action. + config ACPI_APEI_MEMORY_FAILURE bool "APEI memory error recovering support" depends on ACPI_APEI && MEMORY_FAILURE diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 626552e..7e59a73 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -116,11 +116,7 @@ static inline bool is_hest_type_generic_v2(struct ghes *ghes) * Two virtual pages are used, one for IRQ/PROCESS context, the other for * NMI context (optionally). */ -#ifdef CONFIG_HAVE_ACPI_APEI_NMI #define GHES_IOREMAP_PAGES 2 -#else -#define GHES_IOREMAP_PAGES 1 -#endif #define GHES_IOREMAP_IRQ_PAGE(base)(base) #define GHES_IOREMAP_NMI_PAGE(base)((base) + PAGE_SIZE) @@ -159,10 +155,14 @@ static void ghes_ioremap_exit(void) static void __iomem *ghes_ioremap_pfn_nmi(u64 pfn) { unsigned long vaddr; + phys_addr_t paddr; + pgprot_t prot; vaddr = (unsigned lo
[PATCH V17 05/11] arm64: exception: handle Synchronous External Abort
SEA exceptions are often caused by an uncorrected hardware error, and are handled when data abort and instruction abort exception classes have specific values for their Fault Status Code. When SEA occurs, before killing the process, report the error in the kernel logs. Update fault_info[] with specific SEA faults so that the new SEA handler is used. Signed-off-by: Tyler Baicar CC: Jonathan (Zhixiong) Zhang Reviewed-by: James Morse Acked-by: Catalin Marinas --- arch/arm64/include/asm/esr.h | 1 + arch/arm64/mm/fault.c| 45 ++-- 2 files changed, 36 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h index 85997c0..28bf02e 100644 --- a/arch/arm64/include/asm/esr.h +++ b/arch/arm64/include/asm/esr.h @@ -83,6 +83,7 @@ #define ESR_ELx_WNR(UL(1) << 6) /* Shared ISS field definitions for Data/Instruction aborts */ +#define ESR_ELx_FnV(UL(1) << 10) #define ESR_ELx_EA (UL(1) << 9) #define ESR_ELx_S1PTW (UL(1) << 7) diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 37b95df..6697816 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -522,6 +522,31 @@ static int do_bad(unsigned long addr, unsigned int esr, struct pt_regs *regs) return 1; } +/* + * This abort handler deals with Synchronous External Abort. + * It calls notifiers, and then returns "fault". + */ +static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs) +{ + struct siginfo info; + const struct fault_info *inf; + + inf = esr_to_fault_info(esr); + pr_err("Synchronous External Abort: %s (0x%08x) at 0x%016lx\n", + inf->name, esr, addr); + + info.si_signo = SIGBUS; + info.si_errno = 0; + info.si_code = 0; + if (esr & ESR_ELx_FnV) + info.si_addr = 0; + else + info.si_addr = (void __user *)addr; + arm64_notify_die("", regs, &info, esr); + + return 0; +} + static const struct fault_info fault_info[] = { { do_bad, SIGBUS, 0, "ttbr address size fault" }, { do_bad, SIGBUS, 0, "level 1 address size fault"}, @@ -539,22 +564,22 @@ static int do_bad(unsigned long addr, unsigned int esr, struct pt_regs *regs) { do_page_fault,SIGSEGV, SEGV_ACCERR, "level 1 permission fault" }, { do_page_fault,SIGSEGV, SEGV_ACCERR, "level 2 permission fault" }, { do_page_fault,SIGSEGV, SEGV_ACCERR, "level 3 permission fault" }, - { do_bad, SIGBUS, 0, "synchronous external abort"}, + { do_sea, SIGBUS, 0, "synchronous external abort"}, { do_bad, SIGBUS, 0, "unknown 17" }, { do_bad, SIGBUS, 0, "unknown 18" }, { do_bad, SIGBUS, 0, "unknown 19" }, - { do_bad, SIGBUS, 0, "synchronous external abort (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous external abort (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous external abort (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous external abort (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous parity error" }, + { do_sea, SIGBUS, 0, "level 0 (translation table walk)" }, + { do_sea, SIGBUS, 0, "level 1 (translation table walk)" }, + { do_sea, SIGBUS, 0, "level 2 (translation table walk)" }, + { do_sea, SIGBUS, 0, "level 3 (translation table walk)" }, + { do_sea, SIGBUS, 0, "synchronous parity or ECC error" }, { do_bad, SIGBUS, 0, "unknown 25" }, { do_bad, SIGBUS, 0, "unknown 26" }, { do_bad, SIGBUS, 0, "unknown 27" }, - { do_bad, SIGBUS, 0, "synchronous parity error (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous parity error (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous parity error (translation table walk)" }, - { do_bad, SIGBUS, 0, "synchronous parity error (translation table walk)" }, + { do_sea, SIGBUS, 0, "level 0 synchronous parity error (translatio
[PATCH V17 09/11] ras: acpi / apei: generate trace event for unrecognized CPER section
The UEFI spec includes non-standard section type support in the Common Platform Error Record. This is defined in section N.2.3 of UEFI version 2.5. Currently if the CPER section's type (UUID) does not match any section type that the kernel knows how to parse, a trace event is not generated. Generate a trace event which contains the raw error data for non-standard section type error records. Signed-off-by: Tyler Baicar CC: Jonathan (Zhixiong) Zhang Tested-by: Shiju Jose --- drivers/acpi/apei/ghes.c | 27 +++ drivers/ras/ras.c | 10 +- include/linux/ras.h | 12 include/ras/ras_event.h | 45 + include/uapi/linux/uuid.h | 6 -- 5 files changed, 93 insertions(+), 7 deletions(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index fadfb43..16adbc8 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -45,11 +45,14 @@ #include #include #include +#include +#include #include #include #include #include +#include #include "apei-internal.h" @@ -460,12 +463,22 @@ static void ghes_do_proc(struct ghes *ghes, { int sev, sec_sev; struct acpi_hest_generic_data *gdata; + uuid_le sec_type; + uuid_le *fru_id = &NULL_UUID_LE; + char *fru_text = ""; sev = ghes_severity(estatus->error_severity); apei_estatus_for_each_section(estatus, gdata) { sec_sev = ghes_severity(gdata->error_severity); - if (!uuid_le_cmp(*(uuid_le *)gdata->section_type, -CPER_SEC_PLATFORM_MEM)) { + sec_type = *(uuid_le *)gdata->section_type; + + if (gdata->validation_bits & CPER_SEC_VALID_FRU_ID) + fru_id = (uuid_le *)gdata->fru_id; + + if (gdata->validation_bits & CPER_SEC_VALID_FRU_TEXT) + fru_text = gdata->fru_text; + + if (!uuid_le_cmp(sec_type, CPER_SEC_PLATFORM_MEM)) { struct cper_sec_mem_err *mem_err = acpi_hest_get_payload(gdata); ghes_edac_report_mem_error(ghes, sev, mem_err); @@ -474,8 +487,7 @@ static void ghes_do_proc(struct ghes *ghes, ghes_handle_memory_failure(gdata, sev); } #ifdef CONFIG_ACPI_APEI_PCIEAER - else if (!uuid_le_cmp(*(uuid_le *)gdata->section_type, - CPER_SEC_PCIE)) { + else if (!uuid_le_cmp(sec_type, CPER_SEC_PCIE)) { struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata); if (sev == GHES_SEV_RECOVERABLE && @@ -506,6 +518,13 @@ static void ghes_do_proc(struct ghes *ghes, } #endif + else { + void *err = acpi_hest_get_payload(gdata); + + log_non_standard_event(&sec_type, fru_id, fru_text, + sec_sev, err, + gdata->error_data_length); + } } } diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c index 94f8038..e87fd9e 100644 --- a/drivers/ras/ras.c +++ b/drivers/ras/ras.c @@ -7,11 +7,19 @@ #include #include +#include #define CREATE_TRACE_POINTS #define TRACE_INCLUDE_PATH ../../include/ras #include +void log_non_standard_event(const uuid_le *sec_type, const uuid_le *fru_id, + const char *fru_text, const u8 sev, const u8 *err, + const u32 len) +{ + trace_non_standard_event(sec_type, fru_id, fru_text, sev, err, len); +} + static int __init ras_init(void) { int rc = 0; @@ -27,7 +35,7 @@ static int __init ras_init(void) EXPORT_TRACEPOINT_SYMBOL_GPL(extlog_mem_event); #endif EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event); - +EXPORT_TRACEPOINT_SYMBOL_GPL(non_standard_event); int __init parse_ras_param(char *str) { diff --git a/include/linux/ras.h b/include/linux/ras.h index ffb1471..a7f3ed3 100644 --- a/include/linux/ras.h +++ b/include/linux/ras.h @@ -2,6 +2,7 @@ #define __RAS_H__ #include +#include #ifdef CONFIG_DEBUG_FS int ras_userspace_consumers(void); @@ -22,4 +23,15 @@ static inline void __init cec_init(void) { } static inline int cec_add_elem(u64 pfn){ return -ENODEV; } #endif +#ifdef CONFIG_RAS +void log_non_standard_event(const uuid_le *sec_type, + const uuid_le *fru_id, const char *fru_text, + const u8 sev, const u8 *err, const u32 len); +#else +static void log_non_standard_event(const uuid_le *sec_type, + const uuid_le *fru_id, const char *fru_text, + const u8 sev, const u8 *err, + const u32 len) { return; } +#endif + #endif /* __RAS_H__ */ diff --git a/include/ras/ras_event.h b/inclu
[PATCH V17 08/11] efi: print unrecognized CPER section
UEFI spec allows for non-standard section in Common Platform Error Record. This is defined in section N.2.3 of UEFI version 2.5. Currently if the CPER section's type (UUID) does not match with one of the section types that the kernel knows how to parse, the section is skipped. Therefore, user is not able to see such CPER data, for instance, error record of non-standard section. This change prints out the raw data in hex in the dmesg buffer so that non-standard sections are reported to the user. Non-standard section type errors should be reported to the user because these can include errors which are vendor specific. The data length is taken from Error Data length field of Generic Error Data Entry. The following is a sample output from dmesg: Hardware error from APEI Generic Hardware Error Source: 2 It has been corrected by h/w and requires no further action event severity: corrected time: precise 2017-03-15 20:37:35 Error 0, type: corrected section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b section length: 0x238 : 4d415201 4d492031 453a4d45 435f4343 .RAM1 IMEM:ECC_C 0010: 53515f45 44525f42 E_QSB_RD 0020: 0030: 0101 0101 0040: 0005 0050: 0101 0001 0000 ... The raw data from the error can then be decoded using vendor specific tools. Signed-off-by: Tyler Baicar CC: Jonathan (Zhixiong) Zhang Reviewed-by: James Morse --- drivers/firmware/efi/cper.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index eac0854..d5a5855 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -585,8 +585,15 @@ static void cper_print_tstamp(const char *pfx, else goto err_section_too_small; #endif - } else - printk("%s""section type: unknown, %pUl\n", newpfx, sec_type); + } else { + const void *err = acpi_hest_get_payload(gdata); + + printk("%ssection type: unknown, %pUl\n", newpfx, sec_type); + printk("%ssection length: %#x\n", newpfx, + gdata->error_data_length); + print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4, err, + gdata->error_data_length, true); + } return; -- Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH V17 07/11] acpi: apei: panic OS with fatal error status block
From: "Jonathan (Zhixiong) Zhang" Even if an error status block's severity is fatal, the kernel does not honor the severity level and panic. With the firmware first model, the platform could inform the OS about a fatal hardware error through the non-NMI GHES notification type. The OS should panic when a hardware error record is received with this severity. Call panic() after CPER data in error status block is printed if severity is fatal, before each error section is handled. Signed-off-by: Jonathan (Zhixiong) Zhang Signed-off-by: Tyler Baicar Reviewed-by: James Morse --- drivers/acpi/apei/ghes.c | 36 +--- 1 file changed, 21 insertions(+), 15 deletions(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 7e59a73..fadfb43 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -135,6 +135,8 @@ static inline bool is_hest_type_generic_v2(struct ghes *ghes) static struct ghes_estatus_cache *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE]; static atomic_t ghes_estatus_cache_alloced; +static int ghes_panic_timeout __read_mostly = 30; + static int ghes_ioremap_init(void) { ghes_ioremap_area = __get_vm_area(PAGE_SIZE * GHES_IOREMAP_PAGES, @@ -691,6 +693,16 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2) return apei_write(val, &gv2->read_ack_register); } +static void __ghes_panic(struct ghes *ghes) +{ + __ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus); + + /* reboot to log the error! */ + if (!panic_timeout) + panic_timeout = ghes_panic_timeout; + panic("Fatal hardware error!"); +} + static int ghes_proc(struct ghes *ghes) { int rc; @@ -698,6 +710,11 @@ static int ghes_proc(struct ghes *ghes) rc = ghes_read_estatus(ghes, 0); if (rc) goto out; + + if (ghes_severity(ghes->estatus->error_severity) >= GHES_SEV_PANIC) { + __ghes_panic(ghes); + } + if (!ghes_estatus_cached(ghes->estatus)) { if (ghes_print_estatus(NULL, ghes->generic, ghes->estatus)) ghes_estatus_cache_add(ghes->generic, ghes->estatus); @@ -838,8 +855,6 @@ static inline void ghes_sea_remove(struct ghes *ghes) static LIST_HEAD(ghes_nmi); -static int ghes_panic_timeout __read_mostly = 30; - static void ghes_proc_in_irq(struct irq_work *irq_work) { struct llist_node *llnode, *next; @@ -925,18 +940,6 @@ static void __process_error(struct ghes *ghes) #endif } -static void __ghes_panic(struct ghes *ghes) -{ - oops_begin(); - ghes_print_queued_estatus(); - __ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus); - - /* reboot to log the error! */ - if (panic_timeout == 0) - panic_timeout = ghes_panic_timeout; - panic("Fatal hardware error!"); -} - static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs) { struct ghes *ghes; @@ -954,8 +957,11 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs) } sev = ghes_severity(ghes->estatus->error_severity); - if (sev >= GHES_SEV_PANIC) + if (sev >= GHES_SEV_PANIC) { + oops_begin(); + ghes_print_queued_estatus(); __ghes_panic(ghes); + } if (!(ghes->flags & GHES_TO_CLEAR)) continue; -- Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH V17 11/11] arm/arm64: KVM: add guest SEA support
Currently external aborts are unsupported by the guest abort handling. Add handling for SEAs so that the host kernel reports SEAs which occur in the guest kernel. When an SEA occurs in the guest kernel, the guest exits and is routed to kvm_handle_guest_abort(). Prior to this patch, a print message of an unsupported FSC would be printed and nothing else would happen. With this patch, the code gets routed to the APEI handling of SEAs in the host kernel to report the SEA information. Signed-off-by: Tyler Baicar Acked-by: Catalin Marinas Acked-by: Marc Zyngier Acked-by: Christoffer Dall --- arch/arm/include/asm/kvm_arm.h | 10 ++ arch/arm/include/asm/system_misc.h | 5 + arch/arm64/include/asm/kvm_arm.h | 10 ++ arch/arm64/include/asm/system_misc.h | 2 ++ arch/arm64/mm/fault.c| 22 -- drivers/acpi/apei/ghes.c | 17 +++-- include/acpi/ghes.h | 2 +- virt/kvm/arm/mmu.c | 36 +--- 8 files changed, 92 insertions(+), 12 deletions(-) diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h index a3f0b3d..ebf020b 100644 --- a/arch/arm/include/asm/kvm_arm.h +++ b/arch/arm/include/asm/kvm_arm.h @@ -187,6 +187,16 @@ #define FSC_FAULT (0x04) #define FSC_ACCESS (0x08) #define FSC_PERM (0x0c) +#define FSC_SEA(0x10) +#define FSC_SEA_TTW0 (0x14) +#define FSC_SEA_TTW1 (0x15) +#define FSC_SEA_TTW2 (0x16) +#define FSC_SEA_TTW3 (0x17) +#define FSC_SECC (0x18) +#define FSC_SECC_TTW0 (0x1c) +#define FSC_SECC_TTW1 (0x1d) +#define FSC_SECC_TTW2 (0x1e) +#define FSC_SECC_TTW3 (0x1f) /* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */ #define HPFAR_MASK (~0xf) diff --git a/arch/arm/include/asm/system_misc.h b/arch/arm/include/asm/system_misc.h index a3d61ad..8c4a89f 100644 --- a/arch/arm/include/asm/system_misc.h +++ b/arch/arm/include/asm/system_misc.h @@ -22,6 +22,11 @@ extern unsigned int user_debug; +static inline int handle_guest_sea(phys_addr_t addr, unsigned int esr) +{ + return -1; +} + #endif /* !__ASSEMBLY__ */ #endif /* __ASM_ARM_SYSTEM_MISC_H */ diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 6e99978..61d694c 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -204,6 +204,16 @@ #define FSC_FAULT ESR_ELx_FSC_FAULT #define FSC_ACCESS ESR_ELx_FSC_ACCESS #define FSC_PERM ESR_ELx_FSC_PERM +#define FSC_SEAESR_ELx_FSC_EXTABT +#define FSC_SEA_TTW0 (0x14) +#define FSC_SEA_TTW1 (0x15) +#define FSC_SEA_TTW2 (0x16) +#define FSC_SEA_TTW3 (0x17) +#define FSC_SECC (0x18) +#define FSC_SECC_TTW0 (0x1c) +#define FSC_SECC_TTW1 (0x1d) +#define FSC_SECC_TTW2 (0x1e) +#define FSC_SECC_TTW3 (0x1f) /* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */ #define HPFAR_MASK (~UL(0xf)) diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h index bc81243..95aa442 100644 --- a/arch/arm64/include/asm/system_misc.h +++ b/arch/arm64/include/asm/system_misc.h @@ -56,6 +56,8 @@ void hook_debug_fault_code(int nr, int (*fn)(unsigned long, unsigned int, __show_ratelimited; \ }) +int handle_guest_sea(phys_addr_t addr, unsigned int esr); + #endif /* __ASSEMBLY__ */ #endif /* __ASM_SYSTEM_MISC_H */ diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 9b4e26f..1ce62cc 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -532,6 +532,7 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs) { struct siginfo info; const struct fault_info *inf; + int ret = 0; inf = esr_to_fault_info(esr); pr_err("Synchronous External Abort: %s (0x%08x) at 0x%016lx\n", @@ -546,7 +547,7 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs) if (interrupts_enabled(regs)) nmi_enter(); - ghes_notify_sea(); + ret = ghes_notify_sea(); if (interrupts_enabled(regs)) nmi_exit(); @@ -561,7 +562,7 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs) info.si_addr = (void __user *)addr; arm64_notify_die("", regs, &info, esr); - return 0; + return ret; } static const struct fault_info fault_info[] = { @@ -632,6 +633,23 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs) }; /* + * Handle Synchronous External Aborts that occur in a guest kernel. + * + * The return value will be zero if the SEA was successfully handled + * and non-zero if there was an error processing the error or there was + * no error to process. + */ +int handle_guest_sea(phys_addr_t addr, unsigned int esr) +{ +
[PATCH V17 10/11] trace, ras: add ARM processor error trace event
Currently there are trace events for the various RAS errors with the exception of ARM processor type errors. Add a new trace event for such errors so that the user will know when they occur. These trace events are consistent with the ARM processor error section type defined in UEFI 2.6 spec section N.2.4.4. Signed-off-by: Tyler Baicar Acked-by: Steven Rostedt Reviewed-by: Xie XiuQi --- drivers/acpi/apei/ghes.c| 6 +- drivers/firmware/efi/cper.c | 1 + drivers/ras/ras.c | 6 ++ include/linux/ras.h | 3 +++ include/ras/ras_event.h | 45 + 5 files changed, 60 insertions(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 16adbc8..a0ab5f3 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -518,7 +518,11 @@ static void ghes_do_proc(struct ghes *ghes, } #endif - else { + else if (!uuid_le_cmp(sec_type, CPER_SEC_PROC_ARM)) { + struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata); + + log_arm_hw_error(err); + } else { void *err = acpi_hest_get_payload(gdata); log_non_standard_event(&sec_type, fru_id, fru_text, diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index d5a5855..48a8f69 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -35,6 +35,7 @@ #include #include #include +#include #define INDENT_SP " " diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c index e87fd9e..39701a5 100644 --- a/drivers/ras/ras.c +++ b/drivers/ras/ras.c @@ -20,6 +20,11 @@ void log_non_standard_event(const uuid_le *sec_type, const uuid_le *fru_id, trace_non_standard_event(sec_type, fru_id, fru_text, sev, err, len); } +void log_arm_hw_error(struct cper_sec_proc_arm *err) +{ + trace_arm_event(err); +} + static int __init ras_init(void) { int rc = 0; @@ -36,6 +41,7 @@ static int __init ras_init(void) #endif EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event); EXPORT_TRACEPOINT_SYMBOL_GPL(non_standard_event); +EXPORT_TRACEPOINT_SYMBOL_GPL(arm_event); int __init parse_ras_param(char *str) { diff --git a/include/linux/ras.h b/include/linux/ras.h index a7f3ed3..7a14658 100644 --- a/include/linux/ras.h +++ b/include/linux/ras.h @@ -3,6 +3,7 @@ #include #include +#include #ifdef CONFIG_DEBUG_FS int ras_userspace_consumers(void); @@ -27,11 +28,13 @@ static inline void __init cec_init(void){ } void log_non_standard_event(const uuid_le *sec_type, const uuid_le *fru_id, const char *fru_text, const u8 sev, const u8 *err, const u32 len); +void log_arm_hw_error(struct cper_sec_proc_arm *err); #else static void log_non_standard_event(const uuid_le *sec_type, const uuid_le *fru_id, const char *fru_text, const u8 sev, const u8 *err, const u32 len) { return; } +static void log_arm_hw_error(struct cper_sec_proc_arm *err) { return; } #endif #endif /* __RAS_H__ */ diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h index 4f79ba9..429f46f 100644 --- a/include/ras/ras_event.h +++ b/include/ras/ras_event.h @@ -162,6 +162,51 @@ ); /* + * ARM Processor Events Report + * + * This event is generated when hardware detects an ARM processor error + * has occurred. UEFI 2.6 spec section N.2.4.4. + */ +TRACE_EVENT(arm_event, + + TP_PROTO(const struct cper_sec_proc_arm *proc), + + TP_ARGS(proc), + + TP_STRUCT__entry( + __field(u64, mpidr) + __field(u64, midr) + __field(u32, running_state) + __field(u32, psci_state) + __field(u8, affinity) + ), + + TP_fast_assign( + if (proc->validation_bits & CPER_ARM_VALID_AFFINITY_LEVEL) + __entry->affinity = proc->affinity_level; + else + __entry->affinity = ~0; + if (proc->validation_bits & CPER_ARM_VALID_MPIDR) + __entry->mpidr = proc->mpidr; + else + __entry->mpidr = 0ULL; + __entry->midr = proc->midr; + if (proc->validation_bits & CPER_ARM_VALID_RUNNING_STATE) { + __entry->running_state = proc->running_state; + __entry->psci_state = proc->psci_state; + } else { + __entry->running_state = ~0; + __entry->psci_state = ~0; + } + ), + + TP_printk("affinity level: %d; MPIDR: %016llx; MIDR: %016llx; " + "running state: %d; PSCI state: %d", + __entry->affinity, __entry->mpidr, __entry->midr, + __entry->running_state,
[PATCH V17 03/11] cper: add timestamp print to CPER status printing
The ACPI 6.1 spec added a timestamp to the generic error data entry structure. Print the timestamp out when printing out the error information. Signed-off-by: Tyler Baicar CC: Jonathan (Zhixiong) Zhang --- drivers/firmware/efi/cper.c | 26 ++ 1 file changed, 26 insertions(+) diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index 9024757..229cf92 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -32,6 +32,8 @@ #include #include #include +#include +#include #include #define INDENT_SP " " @@ -387,6 +389,27 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie, pfx, pcie->bridge.secondary_status, pcie->bridge.control); } +static void cper_print_tstamp(const char *pfx, + struct acpi_hest_generic_data_v300 *gdata) +{ + __u8 hour, min, sec, day, mon, year, century, *timestamp; + + if (gdata->validation_bits & ACPI_HEST_GEN_VALID_TIMESTAMP) { + timestamp = (__u8 *)&(gdata->time_stamp); + sec = bcd2bin(timestamp[0]); + min = bcd2bin(timestamp[1]); + hour = bcd2bin(timestamp[2]); + day = bcd2bin(timestamp[4]); + mon = bcd2bin(timestamp[5]); + year = bcd2bin(timestamp[6]); + century = bcd2bin(timestamp[7]); + + printk("%s%ststamp: %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx, + (timestamp[3] & 0x1 ? "precise " : "imprecise "), + century, year, mon, day, hour, min, sec); + } +} + static void cper_estatus_print_section(const char *pfx, struct acpi_hest_generic_data *gdata, int sec_no) @@ -395,6 +418,9 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie, __u16 severity; char newpfx[64]; + if (acpi_hest_get_version(gdata) >= 3) + cper_print_tstamp(pfx, (struct acpi_hest_generic_data_v300 *)gdata); + severity = gdata->error_severity; printk("%s""Error %d, type: %s\n", pfx, sec_no, cper_severity_str(severity)); -- Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH V17 01/11] acpi: apei: read ack upon ghes record consumption
A RAS (Reliability, Availability, Serviceability) controller may be a separate processor running in parallel with OS execution, and may generate error records for consumption by the OS. If the RAS controller produces multiple error records, then they may be overwritten before the OS has consumed them. The Generic Hardware Error Source (GHES) v2 structure introduces the capability for the OS to acknowledge the consumption of the error record generated by the RAS controller. A RAS controller supporting GHESv2 shall wait for the acknowledgment before writing a new error record, thus eliminating the race condition. Add support for parsing of GHESv2 sub-tables as well. Signed-off-by: Tyler Baicar CC: Jonathan (Zhixiong) Zhang Reviewed-by: James Morse --- drivers/acpi/apei/ghes.c | 59 +--- drivers/acpi/apei/hest.c | 7 -- include/acpi/ghes.h | 5 +++- 3 files changed, 65 insertions(+), 6 deletions(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index d0855c0..c072acf 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -46,6 +46,7 @@ #include #include +#include #include #include #include @@ -80,6 +81,11 @@ ((struct acpi_hest_generic_status *)\ ((struct ghes_estatus_node *)(estatus_node) + 1)) +static inline bool is_hest_type_generic_v2(struct ghes *ghes) +{ + return ghes->generic->header.type == ACPI_HEST_TYPE_GENERIC_ERROR_V2; +} + /* * This driver isn't really modular, however for the time being, * continuing to use module_param is the easiest way to remain @@ -240,6 +246,16 @@ static int ghes_estatus_pool_expand(unsigned long len) return 0; } +static int map_gen_v2(struct ghes *ghes) +{ + return apei_map_generic_address(&ghes->generic_v2->read_ack_register); +} + +static void unmap_gen_v2(struct ghes *ghes) +{ + apei_unmap_generic_address(&ghes->generic_v2->read_ack_register); +} + static struct ghes *ghes_new(struct acpi_hest_generic *generic) { struct ghes *ghes; @@ -249,10 +265,17 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic) ghes = kzalloc(sizeof(*ghes), GFP_KERNEL); if (!ghes) return ERR_PTR(-ENOMEM); + ghes->generic = generic; + if (is_hest_type_generic_v2(ghes)) { + rc = map_gen_v2(ghes); + if (rc) + goto err_free; + } + rc = apei_map_generic_address(&generic->error_status_address); if (rc) - goto err_free; + goto err_unmap_read_ack_addr; error_block_length = generic->error_block_length; if (error_block_length > GHES_ESTATUS_MAX_SIZE) { pr_warning(FW_WARN GHES_PFX @@ -264,13 +287,16 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic) ghes->estatus = kmalloc(error_block_length, GFP_KERNEL); if (!ghes->estatus) { rc = -ENOMEM; - goto err_unmap; + goto err_unmap_status_addr; } return ghes; -err_unmap: +err_unmap_status_addr: apei_unmap_generic_address(&generic->error_status_address); +err_unmap_read_ack_addr: + if (is_hest_type_generic_v2(ghes)) + unmap_gen_v2(ghes); err_free: kfree(ghes); return ERR_PTR(rc); @@ -280,6 +306,8 @@ static void ghes_fini(struct ghes *ghes) { kfree(ghes->estatus); apei_unmap_generic_address(&ghes->generic->error_status_address); + if (is_hest_type_generic_v2(ghes)) + unmap_gen_v2(ghes); } static inline int ghes_severity(int severity) @@ -649,6 +677,21 @@ static void ghes_estatus_cache_add( rcu_read_unlock(); } +static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2) +{ + int rc; + u64 val = 0; + + rc = apei_read(&val, &gv2->read_ack_register); + if (rc) + return rc; + + val &= gv2->read_ack_preserve << gv2->read_ack_register.bit_offset; + val |= gv2->read_ack_write<< gv2->read_ack_register.bit_offset; + + return apei_write(val, &gv2->read_ack_register); +} + static int ghes_proc(struct ghes *ghes) { int rc; @@ -661,6 +704,16 @@ static int ghes_proc(struct ghes *ghes) ghes_estatus_cache_add(ghes->generic, ghes->estatus); } ghes_do_proc(ghes, ghes->estatus); + + /* +* GHESv2 type HEST entries introduce support for error acknowledgment, +* so only acknowledge the error if this support is present. +*/ + if (is_hest_type_generic_v2(ghes)) { + rc = ghes_ack_error(ghes->generic_v2); + if (rc) + return rc; + } out: ghes_clear_estatus(ghes); return rc; diff --git a/drivers/acpi/apei/hest.c b/drivers/acpi/apei/hest.c index 8f2a98e..456b488 100644 --- a/drivers/acpi/apei/hest.c ++
[PATCH V17 00/11] Add UEFI 2.6 and ACPI 6.1 updates for RAS on ARM64
When a memory error, CPU error, PCIe error, or other type of hardware error that's covered by RAS occurs, firmware should populate the shared GHES memory location with the proper GHES structures to notify the OS of the error. For example, platforms that implement firmware first handling may implement separate GHES sources for corrected errors and uncorrected errors. If the error is an uncorrectable error, then the firmware will notify the OS immediately since the error needs to be handled ASAP. The OS will then be able to take the appropriate action needed such as offlining a page. If the error is a corrected error, then the firmware will not interrupt the OS immediately. Instead, the OS will see and report the error the next time it's GHES timer expires. The kernel will first parse the GHES structures and report the errors through the kernel logs and then notify the user space through RAS trace events. This allows user space applications such as RAS Daemon to see the errors and report them however the user desires. This patchset extends the kernel functionality for RAS errors based on updates in the UEFI 2.6 and ACPI 6.1 specifications. An example flow from firmware to user space could be: +---+ +>| | | | GHES polling |--+ +-+ |source | | +---+ ++ | | +---+ | | Kernel GHES | || | Firmware | +-->| CPER AER and |-->| RAS trace | | | +---+ | | EDAC drivers | | event| +-+ | | | +---+ ++ | | GHES sci |--+ +>| source | +---+ Add support for Generic Hardware Error Source (GHES) v2, which introduces the capability for the OS to acknowledge the consumption of the error record generated by the Reliability, Availability and Serviceability (RAS) controller. This eliminates potential race conditions between the OS and the RAS controller. Add support for the timestamp field added to the Generic Error Data Entry v3, allowing the OS to log the time that the error is generated by the firmware, rather than the time the error is consumed. This improves the correctness of event sequences when analyzing error logs. The timestamp is added in ACPI 6.1, reference Table 18-343 Generic Error Data Entry. Add support for ARMv8 Common Platform Error Record (CPER) per UEFI 2.6 specification. ARMv8 specific processor error information is reported as part of the CPER records. This provides more detail on for processor error logs. This can help describe ARMv8 cache, tlb, and bus errors. Synchronous External Abort (SEA) represents a specific processor error condition in ARM systems. A handler is added to recognize SEA errors, and a notifier is added to parse and report the errors before the process is killed. Refer to section N.2.1.1 in the Common Platform Error Record appendix of the UEFI 2.6 specification. Currently the kernel ignores CPER records that are unrecognized. On the other hand, UEFI spec allows for non-standard (eg. vendor proprietary) error section type in CPER (Common Platform Error Record), as defined in section N2.3 of UEFI version 2.5. Therefore, user is not able to see hardware error data of non-standard section. If section Type field of Generic Error Data Entry is unrecognized, prints out the raw data in dmesg buffer, and also adds a tracepoint for reporting such hardware errors. Currently even if an error status block's severity is fatal, the kernel does not honor the severity level and panic. With the firmware first model, the platform could inform the OS about a fatal hardware error through the non-NMI GHES notification type. The OS should panic when a hardware error record is received with this severity. Add support to handle SEAs that occur while a KVM guest kernel is running. Currently these are unsupported by the guest abort handling. V17:Rebase on tip Change trace event helper function names Remove unneeded prefixes from commit text V16:Rebase on 4.11 Change helper functions from #defines to inline functions Address checkpatch warnings which make sense Various parameter/variable name changes and spacing changes for better code readibility Comment why only GHESv2 needs to acknowledge the error records Define and set error structures on the same line Change timestamp printing function name to cper_print_tstamp Update timestamp to be a single print again and specify when it's precise or imprecise Only print section length when the length check fails in the ARM CPER record parsing Remove version and length print of ARM error info structures Spell out Multiprocessor Affinity Register (MPIDR) and Power State Coordination Interface (PSCI) Combine invalid context prints for ARM context info parsing
Re: [PATCH v5 32/32] x86/mm: Add support to make use of Secure Memory Encryption
On Fri, May 19, 2017 at 03:16:51PM -0500, Josh Poimboeuf wrote: > I'm the stack validation guy, not the stack protection guy :-) LOL. I thought you were *the* stacks guy. :-))) But once you've validated it, you could protect it then too. :-) -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 32/32] x86/mm: Add support to make use of Secure Memory Encryption
On Fri, May 19, 2017 at 01:30:05PM +0200, Borislav Petkov wrote: > > it is called so early. I can get past it by adding: > > > > CFLAGS_mem_encrypt.o := $(nostackp) > > > > in the arch/x86/mm/Makefile, but that obviously eliminates the support > > for the whole file. Would it be better to split out the sme_enable() > > and other boot routines into a separate file or just apply the > > $(nostackp) to the whole file? > > Josh might have a better idea here... CCed. I'm the stack validation guy, not the stack protection guy :-) But there is a way to disable compiler options on a per-function basis with the gcc __optimize__ function attribute. For example: __attribute__((__optimize__("no-stack-protector"))) -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 23/32] swiotlb: Add warnings for use of bounce buffers with SME
On 5/16/2017 9:52 AM, Borislav Petkov wrote: On Tue, Apr 18, 2017 at 04:20:19PM -0500, Tom Lendacky wrote: Add warnings to let the user know when bounce buffers are being used for DMA when SME is active. Since the bounce buffers are not in encrypted memory, these notifications are to allow the user to determine some appropriate action - if necessary. Signed-off-by: Tom Lendacky --- arch/x86/include/asm/mem_encrypt.h | 11 +++ include/linux/dma-mapping.h| 11 +++ include/linux/mem_encrypt.h|6 ++ lib/swiotlb.c |3 +++ 4 files changed, 31 insertions(+) diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 0637b4b..b406df2 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -26,6 +26,11 @@ static inline bool sme_active(void) return !!sme_me_mask; } +static inline u64 sme_dma_mask(void) +{ + return ((u64)sme_me_mask << 1) - 1; +} + void __init sme_early_encrypt(resource_size_t paddr, unsigned long size); void __init sme_early_decrypt(resource_size_t paddr, @@ -50,6 +55,12 @@ static inline bool sme_active(void) { return false; } + +static inline u64 sme_dma_mask(void) +{ + return 0ULL; +} + #endif static inline void __init sme_early_encrypt(resource_size_t paddr, diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 0977317..f825870 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -10,6 +10,7 @@ #include #include #include +#include /** * List of possible attributes associated with a DMA mapping. The semantics @@ -577,6 +578,11 @@ static inline int dma_set_mask(struct device *dev, u64 mask) if (!dev->dma_mask || !dma_supported(dev, mask)) return -EIO; + + if (sme_active() && (mask < sme_dma_mask())) + dev_warn_ratelimited(dev, +"SME is active, device will require DMA bounce buffers\n"); Bah, no need to break that line - just let it stick out. Ditto for the others. Ok. Thanks, Tom -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 22/32] x86, swiotlb: DMA support for memory encryption
On 5/16/2017 9:27 AM, Borislav Petkov wrote: On Tue, Apr 18, 2017 at 04:20:10PM -0500, Tom Lendacky wrote: Since DMA addresses will effectively look like 48-bit addresses when the memory encryption mask is set, SWIOTLB is needed if the DMA mask of the device performing the DMA does not support 48-bits. SWIOTLB will be initialized to create decrypted bounce buffers for use by these devices. Use a verb in the subject: Subject: x86, swiotlb: Add memory encryption support or similar. Will do. Thanks, Tom -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 19/32] x86/mm: Add support to access persistent memory in the clear
On 5/16/2017 9:04 AM, Borislav Petkov wrote: On Tue, Apr 18, 2017 at 04:19:42PM -0500, Tom Lendacky wrote: Persistent memory is expected to persist across reboots. The encryption key used by SME will change across reboots which will result in corrupted persistent memory. Persistent memory is handed out by block devices through memory remapping functions, so be sure not to map this memory as encrypted. Signed-off-by: Tom Lendacky --- arch/x86/mm/ioremap.c | 31 ++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index bce0604..55317ba 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -425,17 +425,46 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr) * Examine the physical address to determine if it is an area of memory * that should be mapped decrypted. If the memory is not part of the * kernel usable area it was accessed and created decrypted, so these - * areas should be mapped decrypted. + * areas should be mapped decrypted. And since the encryption key can + * change across reboots, persistent memory should also be mapped + * decrypted. */ static bool memremap_should_map_decrypted(resource_size_t phys_addr, unsigned long size) { + int is_pmem; + + /* +* Check if the address is part of a persistent memory region. +* This check covers areas added by E820, EFI and ACPI. +*/ + is_pmem = region_intersects(phys_addr, size, IORESOURCE_MEM, + IORES_DESC_PERSISTENT_MEMORY); + if (is_pmem != REGION_DISJOINT) + return true; + + /* +* Check if the non-volatile attribute is set for an EFI +* reserved area. +*/ + if (efi_enabled(EFI_BOOT)) { + switch (efi_mem_type(phys_addr)) { + case EFI_RESERVED_TYPE: + if (efi_mem_attributes(phys_addr) & EFI_MEMORY_NV) + return true; + break; + default: + break; + } + } + /* Check if the address is outside kernel usable area */ switch (e820__get_entry_type(phys_addr, phys_addr + size - 1)) { case E820_TYPE_RESERVED: case E820_TYPE_ACPI: case E820_TYPE_NVS: case E820_TYPE_UNUSABLE: + case E820_TYPE_PRAM: Can't you simply add: case E820_TYPE_PMEM: here too and thus get rid of the region_intersects() thing above? Because, for example, e820_type_to_iores_desc() maps E820_TYPE_PMEM to IORES_DESC_PERSISTENT_MEMORY so those should be equivalent... I'll have to double-check this, but I believe that when persistent memory is identified through the NFIT table it adds it as a resource but doesn't add it as an e820 entry so I can't rely on the type being returned as E820_TYPE_PMEM by e820__get_entry_type(). Thanks, Tom -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] efi-pstore: Fix write/erase id tracking
Prior to the pstore interface refactoring, the "id" generated during a backend pstore_write() was only retained by the internal pstore inode tracking list. Additionally the "part" was ignored, so EFI would encode this in the id. This corrects the misunderstandings and correctly sets "id" during pstore_write(), and uses "part" directly during pstore_erase(). Reported-by: Marta Lofstedt Fixes: 76cc9580e3fb ("pstore: Replace arguments for write() API") Fixes: a61072aae693 ("pstore: Replace arguments for erase() API") Signed-off-by: Kees Cook --- Since the pstore refactor broke this, I intend to push this via the pstore tree. --- drivers/firmware/efi/efi-pstore.c | 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/firmware/efi/efi-pstore.c b/drivers/firmware/efi/efi-pstore.c index 44148fd4c9f2..dda2e96328c0 100644 --- a/drivers/firmware/efi/efi-pstore.c +++ b/drivers/firmware/efi/efi-pstore.c @@ -53,6 +53,7 @@ static int efi_pstore_read_func(struct efivar_entry *entry, if (sscanf(name, "dump-type%u-%u-%d-%lu-%c", &record->type, &part, &cnt, &time, &data_type) == 5) { record->id = generic_id(time, part, cnt); + record->part = part; record->count = cnt; record->time.tv_sec = time; record->time.tv_nsec = 0; @@ -64,6 +65,7 @@ static int efi_pstore_read_func(struct efivar_entry *entry, } else if (sscanf(name, "dump-type%u-%u-%d-%lu", &record->type, &part, &cnt, &time) == 4) { record->id = generic_id(time, part, cnt); + record->part = part; record->count = cnt; record->time.tv_sec = time; record->time.tv_nsec = 0; @@ -77,6 +79,7 @@ static int efi_pstore_read_func(struct efivar_entry *entry, * multiple logs, remains. */ record->id = generic_id(time, part, 0); + record->part = part; record->count = 0; record->time.tv_sec = time; record->time.tv_nsec = 0; @@ -241,9 +244,15 @@ static int efi_pstore_write(struct pstore_record *record) efi_guid_t vendor = LINUX_EFI_CRASH_GUID; int i, ret = 0; + record->time.tv_sec = get_seconds(); + record->time.tv_nsec = 0; + + record->id = generic_id(record->time.tv_sec, record->part, + record->count); + snprintf(name, sizeof(name), "dump-type%u-%u-%d-%lu-%c", record->type, record->part, record->count, -get_seconds(), record->compressed ? 'C' : 'D'); +record->time.tv_sec, record->compressed ? 'C' : 'D'); for (i = 0; i < DUMP_NAME_LEN; i++) efi_name[i] = name[i]; @@ -255,7 +264,6 @@ static int efi_pstore_write(struct pstore_record *record) if (record->reason == KMSG_DUMP_OOPS) efivar_run_worker(); - record->id = record->part; return ret; }; @@ -287,7 +295,7 @@ static int efi_pstore_erase_func(struct efivar_entry *entry, void *data) * holding multiple logs, remains. */ snprintf(name_old, sizeof(name_old), "dump-type%u-%u-%lu", - ed->record->type, (unsigned int)ed->record->id, + ed->record->type, ed->record->part, ed->record->time.tv_sec); for (i = 0; i < DUMP_NAME_LEN; i++) @@ -320,10 +328,7 @@ static int efi_pstore_erase(struct pstore_record *record) char name[DUMP_NAME_LEN]; efi_char16_t efi_name[DUMP_NAME_LEN]; int found, i; - unsigned int part; - do_div(record->id, 1000); - part = do_div(record->id, 100); snprintf(name, sizeof(name), "dump-type%u-%u-%d-%lu", record->type, record->part, record->count, record->time.tv_sec); -- 2.7.4 -- Kees Cook Pixel Security -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] efifb: Avoid reconfiguration of BAR that covers the framebuffer
Hi Bjorn, On 19 May 2017 at 17:27, Bjorn Helgaas wrote: > [+cc linux-pci] > > On Tue, Apr 04, 2017 at 04:27:44PM +0100, Ard Biesheuvel wrote: >> On UEFI systems, the PCI subsystem is enumerated by the firmware, >> and if a graphical framebuffer is exposed by a PCI device, its base >> address and size are exposed to the OS via the Graphics Output >> Protocol (GOP). >> >> On arm64 PCI systems, the entire PCI hierarchy is reconfigured from >> scratch at boot. This may result in the GOP framebuffer address to >> become stale, if the BAR covering the framebuffer is modified. This >> will cause the framebuffer to become unresponsive, and may in some >> cases result in unpredictable behavior if the range is reassigned to >> another device. >> >> So add a non-x86 quirk to the EFI fb driver to find the BAR associated >> with the GOP base address, and claim the BAR resource so that the PCI >> core will not move it. > > I know this has already been merged as 55d728a40d36 ("efi/fb: Avoid > reconfiguration of BAR that covers the framebuffer"), and I'm not > suggesting that we revert it, but I have some misgivings. > Thanks for taking a look. I have been struggling with this issue for a while now. > One is the "#ifndef CONFIG_X86". In principle, there is nothing arch- > specific here. I don't think it's a good idea to build in dependencies on > things like "this arch preserves (or reprograms) PCI BARs". The PCI core > may reprogram all, some, or none of the PCI BARs, depending on what (if > anything) firmware has done. > IIRC this was a late addition. I agree that the issue exists in theory on x86 as well. However, I was primarily dealing with the reality of arm64 systems that suddenly explode in inexplicable ways after upgrading the kernel to one that happens to have EFIFB built in. The patch went into -stable as well, so I still think adding #ifndef CONFIG_X86 was the right choice here. > Another is the use of pci_claim_resource() to express the idea that "this > BAR can not be moved". We have IORESOURCE_PCI_FIXED for that purpose, and > previous versions of the patch used that. I understand there was some > problem with that, but I wish we could figure out and fix that problem > instead of using a different mechanism. > OK. The problem was that IORESOURCE_PCI_FIXED does not prevent the PCI subsystem from handing out the same range to another device. > I'm not even completely sold on the idea that we need to prevent the BAR > from being moved. I'm not a UEFI expert, but if this requirement is in the > spec, we should reference it. If not, it should be sufficient to remember > the boot-time BAR value, match the GOP base to *that*, and map it to > whatever the current BAR value is. > There is no such requirement in the spec. The graphics output protocol is not described in terms of PCI, BARs etc. The framebuffer is simply a memory range with side effects that is left enabled when handing over to the OS. In summary, I am as unhappy with the patch as you are, but it is still an improvement over the previous situation, so let's simply collaborate to get this into shape going forward. My preference would be to investigate IORESOURCE_PCI_FIXED again, because that still seems like the best way to deal with a live framebuffer on a PCI device that has memory decoding enabled. It should also address the issue with the current version of the patch, i.e., that claiming resources at this point is not possible if the device resides behind a bridge. So is there any guidance you can give as to how to proceed? If we set IORESOURCE_PCI_FIXED, how should be prevent the PCI layer from assigning this resource elsewhere if we cannot claim it yet? Regards, Ard. >> Fixes: 9822504c1fa5 ("efifb: Enable the efi-framebuffer platform driver ...") >> Cc: # v4.7+ >> Cc: Matt Fleming >> Cc: Peter Jones >> Signed-off-by: Ard Biesheuvel >> --- >> drivers/video/fbdev/efifb.c | 66 >> - >> 1 file changed, 65 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c >> index 8c4dc1e1f94f..758960b6aec9 100644 >> --- a/drivers/video/fbdev/efifb.c >> +++ b/drivers/video/fbdev/efifb.c >> @@ -10,6 +10,7 @@ >> #include >> #include >> #include >> +#include >> #include >> #include >> #include >> @@ -143,6 +144,8 @@ static struct attribute *efifb_attrs[] = { >> }; >> ATTRIBUTE_GROUPS(efifb); >> >> +static bool pci_dev_disabled;/* FB base matches BAR of a disabled >> device */ >> + >> static int efifb_probe(struct platform_device *dev) >> { >> struct fb_info *info; >> @@ -152,7 +155,7 @@ static int efifb_probe(struct platform_device *dev) >> unsigned int size_total; >> char *option = NULL; >> >> - if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) >> + if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI || pci_dev_disabled) >> return -ENODEV; >> >> if (fb_get_options("efifb",
Re: [PATCH 2/2] efifb: Avoid reconfiguration of BAR that covers the framebuffer
[+cc linux-pci] On Tue, Apr 04, 2017 at 04:27:44PM +0100, Ard Biesheuvel wrote: > On UEFI systems, the PCI subsystem is enumerated by the firmware, > and if a graphical framebuffer is exposed by a PCI device, its base > address and size are exposed to the OS via the Graphics Output > Protocol (GOP). > > On arm64 PCI systems, the entire PCI hierarchy is reconfigured from > scratch at boot. This may result in the GOP framebuffer address to > become stale, if the BAR covering the framebuffer is modified. This > will cause the framebuffer to become unresponsive, and may in some > cases result in unpredictable behavior if the range is reassigned to > another device. > > So add a non-x86 quirk to the EFI fb driver to find the BAR associated > with the GOP base address, and claim the BAR resource so that the PCI > core will not move it. I know this has already been merged as 55d728a40d36 ("efi/fb: Avoid reconfiguration of BAR that covers the framebuffer"), and I'm not suggesting that we revert it, but I have some misgivings. One is the "#ifndef CONFIG_X86". In principle, there is nothing arch- specific here. I don't think it's a good idea to build in dependencies on things like "this arch preserves (or reprograms) PCI BARs". The PCI core may reprogram all, some, or none of the PCI BARs, depending on what (if anything) firmware has done. Another is the use of pci_claim_resource() to express the idea that "this BAR can not be moved". We have IORESOURCE_PCI_FIXED for that purpose, and previous versions of the patch used that. I understand there was some problem with that, but I wish we could figure out and fix that problem instead of using a different mechanism. I'm not even completely sold on the idea that we need to prevent the BAR from being moved. I'm not a UEFI expert, but if this requirement is in the spec, we should reference it. If not, it should be sufficient to remember the boot-time BAR value, match the GOP base to *that*, and map it to whatever the current BAR value is. > Fixes: 9822504c1fa5 ("efifb: Enable the efi-framebuffer platform driver ...") > Cc: # v4.7+ > Cc: Matt Fleming > Cc: Peter Jones > Signed-off-by: Ard Biesheuvel > --- > drivers/video/fbdev/efifb.c | 66 > - > 1 file changed, 65 insertions(+), 1 deletion(-) > > diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c > index 8c4dc1e1f94f..758960b6aec9 100644 > --- a/drivers/video/fbdev/efifb.c > +++ b/drivers/video/fbdev/efifb.c > @@ -10,6 +10,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -143,6 +144,8 @@ static struct attribute *efifb_attrs[] = { > }; > ATTRIBUTE_GROUPS(efifb); > > +static bool pci_dev_disabled;/* FB base matches BAR of a disabled > device */ > + > static int efifb_probe(struct platform_device *dev) > { > struct fb_info *info; > @@ -152,7 +155,7 @@ static int efifb_probe(struct platform_device *dev) > unsigned int size_total; > char *option = NULL; > > - if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) > + if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI || pci_dev_disabled) > return -ENODEV; > > if (fb_get_options("efifb", &option)) > @@ -360,3 +363,64 @@ static struct platform_driver efifb_driver = { > }; > > builtin_platform_driver(efifb_driver); > + > +#ifndef CONFIG_X86 > + > +static bool pci_bar_found; /* did we find a BAR matching the efifb base? */ > + > +static void claim_efifb_bar(struct pci_dev *dev, int idx) > +{ > + u16 word; > + > + pci_bar_found = true; > + > + pci_read_config_word(dev, PCI_COMMAND, &word); > + if (!(word & PCI_COMMAND_MEMORY)) { > + pci_dev_disabled = true; > + dev_err(&dev->dev, > + "BAR %d: assigned to efifb but device is disabled!\n", > + idx); > + return; > + } > + > + if (pci_claim_resource(dev, idx)) { > + pci_dev_disabled = true; > + dev_err(&dev->dev, > + "BAR %d: failed to claim resource for efifb!\n", idx); > + return; > + } > + > + dev_info(&dev->dev, "BAR %d: assigned to efifb\n", idx); > +} > + > +static void efifb_fixup_resources(struct pci_dev *dev) > +{ > + u64 base = screen_info.lfb_base; > + u64 size = screen_info.lfb_size; > + int i; > + > + if (pci_bar_found || screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) > + return; > + > + if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE) > + base |= (u64)screen_info.ext_lfb_base << 32; > + > + if (!base) > + return; > + > + for (i = 0; i < PCI_STD_RESOURCE_END; i++) { > + struct resource *res = &dev->resource[i]; > + > + if (!(res->flags & IORESOURCE_MEM)) > + continue; > + > + if (res->start <= base && res->end >= base + size - 1) {
Re: [PATCH] efi/reboot: Fall back to original power-off method if EFI_RESET_SHUTDOWN returns
On 23 April 2017 at 13:36, Hans de Goede wrote: > Commit 44be28e9dd98 ("x86/reboot: Add EFI reboot quirk for ACPI Hardware > Reduced flag") sets pm_power_off to efi_power_off() when the > acpi_gbl_reduced_hardware flag is set. > > According to its commit message this is necessary because: "BayTrail-T > class of hardware requires EFI in order to powerdown and reboot and no > other reliable method exists" > > But I have a Bay Trail CR tablet where the EFI_RESET_SHUTDOWN call does > not work, it simply returns without doing anything (AFAICT). > > So it seems that some Bay Trail devices must use EFI for power-off, while > for others only ACPI works. > > Note that efi_power_off() only gets used if the platform code defines > efi_poweroff_required() and that returns true, this currently only ever > happens on x86. > > Since on the devices which need ACPI for power-off the EFI_RESET_SHUTDOWN > call simply returns, this patch makes the efi-reboot code remember the > old pm_power_off handler and if EFI_RESET_SHUTDOWN returns it falls back > to calling that. > > This seems preferable to dmi-quirking our way out of this, since there > are likely quite a few devices suffering from this. > > Cc: Mark Salter > Signed-off-by: Hans de Goede > --- > drivers/firmware/efi/reboot.c | 12 +++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/drivers/firmware/efi/reboot.c b/drivers/firmware/efi/reboot.c > index 62ead9b..7117e2d 100644 > --- a/drivers/firmware/efi/reboot.c > +++ b/drivers/firmware/efi/reboot.c > @@ -5,6 +5,8 @@ > #include > #include > > +void (*orig_pm_power_off)(void); > + > int efi_reboot_quirk_mode = -1; > > void efi_reboot(enum reboot_mode reboot_mode, const char *__unused) > @@ -51,6 +53,12 @@ bool __weak efi_poweroff_required(void) > static void efi_power_off(void) > { > efi.reset_system(EFI_RESET_SHUTDOWN, EFI_SUCCESS, 0, NULL); > + /* > +* The above call should not return, if it does fall back to > +* the original power off method (typically ACPI poweroff). > +*/ > + if (orig_pm_power_off) > + orig_pm_power_off(); > } > > static int __init efi_shutdown_init(void) > @@ -58,8 +66,10 @@ static int __init efi_shutdown_init(void) > if (!efi_enabled(EFI_RUNTIME_SERVICES)) > return -ENODEV; > > - if (efi_poweroff_required()) > + if (efi_poweroff_required()) { > + orig_pm_power_off = pm_power_off; > pm_power_off = efi_power_off; > + } > > return 0; > } This does not look unreasonable to me, but this is more Matt's turf so I will let him handle this one. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/5] efi: Move the x86 secure boot switch to generic code
First of all, apologies for taking so long to respond. On 6 April 2017 at 13:49, David Howells wrote: > Move the switch-statement in x86's setup_arch() that inteprets the > secure_boot boot parameter to generic code. > > Suggested-by: Ard Biesheuvel > Signed-off-by: David Howells > --- > > arch/x86/kernel/setup.c| 14 +- > drivers/firmware/efi/Kconfig | 23 +++ > drivers/firmware/efi/Makefile |3 ++- > drivers/firmware/efi/secure_boot.c | 34 ++ > include/linux/efi.h|6 ++ > 5 files changed, 66 insertions(+), 14 deletions(-) > create mode 100644 drivers/firmware/efi/secure_boot.c > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > index 4bf0c8926a1c..b89979ffa6e5 100644 > --- a/arch/x86/kernel/setup.c > +++ b/arch/x86/kernel/setup.c > @@ -1178,19 +1178,7 @@ void __init setup_arch(char **cmdline_p) > /* Allocate bigger log buffer */ > setup_log_buf(1); > > - if (efi_enabled(EFI_BOOT)) { > - switch (boot_params.secure_boot) { > - case efi_secureboot_mode_disabled: > - pr_info("Secure boot disabled\n"); > - break; > - case efi_secureboot_mode_enabled: > - pr_info("Secure boot enabled\n"); > - break; > - default: > - pr_info("Secure boot could not be determined\n"); > - break; > - } > - } > + efi_set_secure_boot(boot_params.secure_boot); > > reserve_initrd(); > > diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig > index 2e78b0b96d74..4b902ffbfcf4 100644 > --- a/drivers/firmware/efi/Kconfig > +++ b/drivers/firmware/efi/Kconfig > @@ -84,6 +84,29 @@ config EFI_PARAMS_FROM_FDT > config EFI_RUNTIME_WRAPPERS > bool > > +config EFI_SECURE_BOOT > + bool "Support UEFI Secure Boot and lock down the kernel in secure > boot mode" > + default n > + help > + UEFI Secure Boot provides a mechanism for ensuring that the firmware > + will only load signed bootloaders and kernels. Secure boot mode may > + be determined from EFI variables provided by the BIOS if not Please replace 'the BIOS' with something more generic. > + indicated by the boot parameters. > + > + Enabling this option turns on support for UEFI secure boot in the > + kernel. This will result in various kernel facilities being locked > + away from userspace if the kernel detects that it has been booted in > + secure boot mode. If it hasn't been booted in secure boot mode, or > + this cannot be determined, the lock down doesn't occur. > + > + The kernel facilities that get locked down include: > + - Viewing or changing the kernel's memory > + - Directly accessing ioports > + - Directly specifying ioports and other hardware parameters to > drivers > + - Storing the kernel image unencrypted for hibernation > + - Loading unsigned modules > + - Kexec'ing unsigned images > + > config EFI_ARMSTUB > bool > > diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile > index ad67342313ed..65969f840685 100644 > --- a/drivers/firmware/efi/Makefile > +++ b/drivers/firmware/efi/Makefile > @@ -22,7 +22,8 @@ obj-$(CONFIG_EFI_FAKE_MEMMAP) += fake_mem.o > obj-$(CONFIG_EFI_BOOTLOADER_CONTROL) += efibc.o > obj-$(CONFIG_EFI_TEST) += test/ > obj-$(CONFIG_EFI_DEV_PATH_PARSER) += dev-path-parser.o > -obj-$(CONFIG_APPLE_PROPERTIES) += apple-properties.o > +obj-$(CONFIG_EFI_SECURE_BOOT) += secure_boot.o > +obj-$(CONFIG_APPLE_PROPERTIES) += apple-properties.oo Spurious change here > > arm-obj-$(CONFIG_EFI) := arm-init.o arm-runtime.o > obj-$(CONFIG_ARM) += $(arm-obj-y) > diff --git a/drivers/firmware/efi/secure_boot.c > b/drivers/firmware/efi/secure_boot.c > new file mode 100644 > index ..cf5bccae15e8 > --- /dev/null > +++ b/drivers/firmware/efi/secure_boot.c We have a file called secureboot.c in libstub/, so for consistency, could you please drop the underscore? > @@ -0,0 +1,34 @@ > +/* Core kernel secure boot support. > + * > + * Copyright (C) 2017 Red Hat, Inc. All Rights Reserved. > + * Written by David Howells (dhowe...@redhat.com) > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms of the GNU General Public Licence > + * as published by the Free Software Foundation; either version > + * 2 of the Licence, or (at your option) any later version. > + */ > + > +#include > +#include > +#include > + > +/* > + * Decide what to do when UEFI secure boot mode is enabled. > + */ > +void __init efi_set_secure_boot(enum efi_secureboot_mode m
Re: [PATCH] efi-pstore: Fix read iter after pstore API refactor
On 18 May 2017 at 17:41, Kees Cook wrote: > On Thu, May 18, 2017 at 9:18 AM, Ard Biesheuvel > wrote: >> On 18 May 2017 at 14:01, Kees Cook wrote: >>> On Thu, May 18, 2017 at 3:35 AM, Ard Biesheuvel >>> wrote: On 12 May 2017 at 22:58, Kees Cook wrote: > During the internal pstore API refactoring, the EFI vars read entry was > accidentally made to update a stack variable instead of the pstore > private data pointer. This corrects the problem (and removes the now > needless argument). > > Signed-off-by: Kees Cook Does this need a cc stable? >>> >>> No, the refactor first appeared in 4.12-rc1. >>> >> >> OK. Applied. > > Err, eek, no, it's already gone into Linus's tree. > OK, I dropped it again :-) -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 32/32] x86/mm: Add support to make use of Secure Memory Encryption
On Fri, Apr 21, 2017 at 01:56:13PM -0500, Tom Lendacky wrote: > On 4/18/2017 4:22 PM, Tom Lendacky wrote: > > Add support to check if SME has been enabled and if memory encryption > > should be activated (checking of command line option based on the > > configuration of the default state). If memory encryption is to be > > activated, then the encryption mask is set and the kernel is encrypted > > "in place." > > > > Signed-off-by: Tom Lendacky > > --- > > arch/x86/kernel/head_64.S |1 + > > arch/x86/mm/mem_encrypt.c | 83 > > +++-- > > 2 files changed, 80 insertions(+), 4 deletions(-) > > > > ... > > > > > -unsigned long __init sme_enable(void) > > +unsigned long __init sme_enable(struct boot_params *bp) > > { > > + const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off; > > + unsigned int eax, ebx, ecx, edx; > > + unsigned long me_mask; > > + bool active_by_default; > > + char buffer[16]; > > So it turns out that when KASLR is enabled (CONFIG_RAMDOMIZE_BASE=y) > the stack-protector support causes issues with this function because What issues? > it is called so early. I can get past it by adding: > > CFLAGS_mem_encrypt.o := $(nostackp) > > in the arch/x86/mm/Makefile, but that obviously eliminates the support > for the whole file. Would it be better to split out the sme_enable() > and other boot routines into a separate file or just apply the > $(nostackp) to the whole file? Josh might have a better idea here... CCed. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 32/32] x86/mm: Add support to make use of Secure Memory Encryption
On Tue, Apr 18, 2017 at 04:22:23PM -0500, Tom Lendacky wrote: > Add support to check if SME has been enabled and if memory encryption > should be activated (checking of command line option based on the > configuration of the default state). If memory encryption is to be > activated, then the encryption mask is set and the kernel is encrypted > "in place." > > Signed-off-by: Tom Lendacky > --- > arch/x86/kernel/head_64.S |1 + > arch/x86/mm/mem_encrypt.c | 83 > +++-- > 2 files changed, 80 insertions(+), 4 deletions(-) ... > +unsigned long __init sme_enable(struct boot_params *bp) > { > + const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off; > + unsigned int eax, ebx, ecx, edx; > + unsigned long me_mask; > + bool active_by_default; > + char buffer[16]; > + u64 msr; > + > + /* Check for the SME support leaf */ > + eax = 0x8000; > + ecx = 0; > + native_cpuid(&eax, &ebx, &ecx, &edx); > + if (eax < 0x801f) > + goto out; > + > + /* > + * Check for the SME feature: > + * CPUID Fn8000_001F[EAX] - Bit 0 > + * Secure Memory Encryption support > + * CPUID Fn8000_001F[EBX] - Bits 5:0 > + * Pagetable bit position used to indicate encryption > + */ > + eax = 0x801f; > + ecx = 0; > + native_cpuid(&eax, &ebx, &ecx, &edx); > + if (!(eax & 1)) > + goto out; < newline here. > + me_mask = 1UL << (ebx & 0x3f); > + > + /* Check if SME is enabled */ > + msr = __rdmsr(MSR_K8_SYSCFG); > + if (!(msr & MSR_K8_SYSCFG_MEM_ENCRYPT)) > + goto out; > + > + /* > + * Fixups have not been applied to phys_base yet, so we must obtain > + * the address to the SME command line option data in the following > + * way. > + */ > + asm ("lea sme_cmdline_arg(%%rip), %0" > + : "=r" (cmdline_arg) > + : "p" (sme_cmdline_arg)); > + asm ("lea sme_cmdline_on(%%rip), %0" > + : "=r" (cmdline_on) > + : "p" (sme_cmdline_on)); > + asm ("lea sme_cmdline_off(%%rip), %0" > + : "=r" (cmdline_off) > + : "p" (sme_cmdline_off)); > + > + if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT)) > + active_by_default = true; > + else > + active_by_default = false; > + > + cmdline_ptr = (const char *)((u64)bp->hdr.cmd_line_ptr | > + ((u64)bp->ext_cmd_line_ptr << 32)); > + > + cmdline_find_option(cmdline_ptr, cmdline_arg, buffer, sizeof(buffer)); > + > + if (strncmp(buffer, cmdline_on, sizeof(buffer)) == 0) > + sme_me_mask = me_mask; Why doesn't simply if (!strncmp(buffer, "on", 2)) ... work? -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html