Re: [RFC PATCH v5 2/9] fadump: Reserve the memory for firmware assisted dump.
On 11/25/2011 04:32 AM, Paul Mackerras wrote: On Tue, Nov 15, 2011 at 08:43:43PM +0530, Mahesh J Salgaonkar wrote: From: Mahesh Salgaonkar mah...@linux.vnet.ibm.com Reserve the memory during early boot to preserve CPU state data, HPTE region and RMR region data in case of kernel crash. At the time of crash, powerpc firmware will store CPU state data, HPTE region data and move RMR region data to the reserved memory area. What is RMR? I don't see anywhere that you explain this acronym. Is it the same as the RMA (real mode area)? Yes, it is the same as the RMA. I think I will replace all RMR/RMO with RMA. +config FA_DUMP +bool Firmware-assisted dump Is this new fadump infrastructure intended to supersede the existing phyp dump code? Does it use the same phyp interfaces as phyp dump? If so, you should probably remove the phyp dump code and config option as the final patch in your series. +/* + * The RMR region will be saved for later dumping when kernel crashes. + * Set this to RMO size. + */ +#define RMR_START 0x0 +#define RMR_END (ppc64_rma_size) An explanation of RMR here, and what the distinction (if any) between RMR and RMA/RMO is, would help future readers. Will change this to RMA_START and RMA_END +sections = of_get_flat_dt_prop(node, ibm,configure-kernel-dump-sizes, +size); + +if (!sections) +return 0; + +num_sections = size / sizeof(struct dump_section); + +for (i = 0; i num_sections; i++) { +switch (sections[i].dump_section) { +case FADUMP_CPU_STATE_DATA: +fw_dump.cpu_state_data_size = sections[i].section_size; +break; +case FADUMP_HPTE_REGION: +fw_dump.hpte_region_size = sections[i].section_size; +break; It's generally better to use of_read_number() or of_read_ulong() to parse OF properties, rather than using a structure like this. +/* divide by 20 to get 5% of value */ +size = memblock_end_of_DRAM(); +do_div(size, 20); You could just say size = memblock_end_of_DRAM() / 20 here; no need to use do_div, since we won't be using this code on 32-bit platforms. +if (!fw_dump.fadump_supported) { +printk(KERN_ERR Firmware-assisted dump is not supported on + this hardware\n); This shouldn't be KERN_ERR; it's not an error to boot a kernel with fadump configured in on a machine that doesn't have firmware fadump support. I don't think we really need any message, but if we have one it should be KERN_INFO at most. +/* Look for fadump= cmdline option. */ +static int __init early_fadump_param(char *p) +{ +if (!p) +return 1; + +if (p[0] == '1') +fw_dump.fadump_enabled = 1; +else if (p[0] == '0') +fw_dump.fadump_enabled = 0; I think it's usual to allow on and off as values for this kind of option. There might be a handy little helper function to parse this sort of thing (but if there is I don't know what it is called). Will rework on your suggestions. Thanks for the review. Thanks, -Mahesh. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC PATCH v5 2/9] fadump: Reserve the memory for firmware assisted dump.
On Tue, Nov 15, 2011 at 08:43:43PM +0530, Mahesh J Salgaonkar wrote: From: Mahesh Salgaonkar mah...@linux.vnet.ibm.com Reserve the memory during early boot to preserve CPU state data, HPTE region and RMR region data in case of kernel crash. At the time of crash, powerpc firmware will store CPU state data, HPTE region data and move RMR region data to the reserved memory area. What is RMR? I don't see anywhere that you explain this acronym. Is it the same as the RMA (real mode area)? +config FA_DUMP + bool Firmware-assisted dump Is this new fadump infrastructure intended to supersede the existing phyp dump code? Does it use the same phyp interfaces as phyp dump? If so, you should probably remove the phyp dump code and config option as the final patch in your series. +/* + * The RMR region will be saved for later dumping when kernel crashes. + * Set this to RMO size. + */ +#define RMR_START0x0 +#define RMR_END (ppc64_rma_size) An explanation of RMR here, and what the distinction (if any) between RMR and RMA/RMO is, would help future readers. + sections = of_get_flat_dt_prop(node, ibm,configure-kernel-dump-sizes, + size); + + if (!sections) + return 0; + + num_sections = size / sizeof(struct dump_section); + + for (i = 0; i num_sections; i++) { + switch (sections[i].dump_section) { + case FADUMP_CPU_STATE_DATA: + fw_dump.cpu_state_data_size = sections[i].section_size; + break; + case FADUMP_HPTE_REGION: + fw_dump.hpte_region_size = sections[i].section_size; + break; It's generally better to use of_read_number() or of_read_ulong() to parse OF properties, rather than using a structure like this. + /* divide by 20 to get 5% of value */ + size = memblock_end_of_DRAM(); + do_div(size, 20); You could just say size = memblock_end_of_DRAM() / 20 here; no need to use do_div, since we won't be using this code on 32-bit platforms. + if (!fw_dump.fadump_supported) { + printk(KERN_ERR Firmware-assisted dump is not supported on + this hardware\n); This shouldn't be KERN_ERR; it's not an error to boot a kernel with fadump configured in on a machine that doesn't have firmware fadump support. I don't think we really need any message, but if we have one it should be KERN_INFO at most. +/* Look for fadump= cmdline option. */ +static int __init early_fadump_param(char *p) +{ + if (!p) + return 1; + + if (p[0] == '1') + fw_dump.fadump_enabled = 1; + else if (p[0] == '0') + fw_dump.fadump_enabled = 0; I think it's usual to allow on and off as values for this kind of option. There might be a handy little helper function to parse this sort of thing (but if there is I don't know what it is called). Paul. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFC PATCH v5 2/9] fadump: Reserve the memory for firmware assisted dump.
From: Mahesh Salgaonkar mah...@linux.vnet.ibm.com Reserve the memory during early boot to preserve CPU state data, HPTE region and RMR region data in case of kernel crash. At the time of crash, powerpc firmware will store CPU state data, HPTE region data and move RMR region data to the reserved memory area. If the firmware-assisted dump fails to reserve the memory, then fallback to existing kexec-based kdump. The most of the code implementation to reserve memory has been adapted from phyp assisted dump implementation written by Linas Vepstas and Manish Ahuja Change in v5: - Merged patch 10/10 which introduces a config option CONFIG_FA_DUMP for firmware assisted dump feature on Powerpc (ppc64) architecture. - Increased MIN_BOOT_MEM by 64M to avoid OOM issue during network dump capture. When kdump infrastructure is configured to save vmcore over network, we run into OOM issue while loading modules related to network setup. Change in v2: - Modified to use standard pr_debug() macro. - Modified early_init_dt_scan_fw_dump() to get the size of ibm,configure-kernel-dump-sizes property and use it to iterate through an array of dump sections. - Introduced boot option 'fadump_reserve_mem=' to let user specify the fadump boot memory to be reserved. Signed-off-by: Mahesh Salgaonkar mah...@linux.vnet.ibm.com --- arch/powerpc/Kconfig | 13 ++ arch/powerpc/include/asm/fadump.h | 69 ++ arch/powerpc/kernel/Makefile |1 arch/powerpc/kernel/fadump.c | 250 + arch/powerpc/kernel/prom.c| 15 ++ 5 files changed, 347 insertions(+), 1 deletions(-) create mode 100644 arch/powerpc/include/asm/fadump.h create mode 100644 arch/powerpc/kernel/fadump.c diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 6926b61..7ce773c 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -379,6 +379,19 @@ config PHYP_DUMP If unsure, say N +config FA_DUMP + bool Firmware-assisted dump + depends on PPC64 PPC_RTAS CRASH_DUMP + help + A robust mechanism to get reliable kernel crash dump with + assistance from firmware. This approach does not use kexec, + instead firmware assists in booting the kdump kernel + while preserving memory contents. Firmware-assisted dump + is meant to be a kdump replacement offering robustness and + speed not possible without system firmware assistance. + + If unsure, say N + config PPCBUG_NVRAM bool Enable reading PPCBUG NVRAM during boot if PPLUS || LOPEC default y if PPC_PREP diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h new file mode 100644 index 000..86b17e8 --- /dev/null +++ b/arch/powerpc/include/asm/fadump.h @@ -0,0 +1,69 @@ +/* + * Firmware Assisted dump header file. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright 2011 IBM Corporation + * Author: Mahesh Salgaonkar mah...@linux.vnet.ibm.com + */ + +#ifndef __PPC64_FA_DUMP_H__ +#define __PPC64_FA_DUMP_H__ + +#ifdef CONFIG_FA_DUMP + +/* + * The RMR region will be saved for later dumping when kernel crashes. + * Set this to RMO size. + */ +#define RMR_START 0x0 +#define RMR_END(ppc64_rma_size) + +/* + * On some Power systems where RMO is 128MB, it still requires minimum of + * 256MB for kernel to boot successfully. When kdump infrastructure is + * configured to save vmcore over network, we run into OOM issue while + * loading modules related to network setup. Hence we need aditional 64M + * of memory to avoid OOM issue. + */ +#define MIN_BOOT_MEM (((RMR_END (0x1UL 28)) ? (0x1UL 28) : RMR_END) \ + + (0x1UL 26)) + +/* Firmware provided dump sections */ +#define FADUMP_CPU_STATE_DATA 0x0001 +#define FADUMP_HPTE_REGION 0x0002 +#define FADUMP_REAL_MODE_REGION0x0011 + +struct fw_dump { + unsigned long cpu_state_data_size; + unsigned long hpte_region_size; + unsigned long boot_memory_size; + unsigned long reserve_dump_area_start; + unsigned long reserve_dump_area_size; + /* cmd line option during boot */ + unsigned long reserve_bootvar; + + int ibm_configure_kernel_dump; + +