[PATCH 3/3] powerpc: Set crashkernel offset to mid of RMA region
On large-config LPARs (192 or more cores), Linux fails to boot due to
insufficient memory in the first memblock. The cause is the memory
reservation for the crash kernel, which starts at a 128MB offset in the
first memblock. This reservation doesn't leave enough space in the first
memblock to accommodate other essential system resources.

The crash kernel start address was set to a 128MB offset by default to
ensure that the crash kernel gets some memory below the RMA region, which
used to be 256MB in size. But given that the RMA region can now be 512MB
or more, setting the crash kernel offset to the middle of the RMA leaves
enough space for the kernel to allocate memory for other system resources.

Since this crash kernel offset change applies only to the LPAR platform,
LPAR feature detection is moved ahead of the crash kernel reservation.
The rest of the LPAR-specific initialization is still done during
pseries_probe_fw_features() as usual.

Signed-off-by: Sourabh Jain
Reported-and-tested-by: Abdul haleem
---
 arch/powerpc/kernel/rtas.c |  4 ++++
 arch/powerpc/kexec/core.c  | 15 +++++++++++----
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index ff80bbad22a5..a49137727370 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1235,6 +1235,10 @@ int __init early_init_dt_scan_rtas(unsigned long node,
 	entryp = of_get_flat_dt_prop(node, "linux,rtas-entry", NULL);
 	sizep = of_get_flat_dt_prop(node, "rtas-size", NULL);
 
+	/* need this feature to decide the crashkernel offset */
+	if (of_get_flat_dt_prop(node, "ibm,hypertas-functions", NULL))
+		powerpc_firmware_features |= FW_FEATURE_LPAR;
+
 	if (basep && entryp && sizep) {
 		rtas.base = *basep;
 		rtas.entry = *entryp;
diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
index 48525e8b5730..71b1bfdadd76 100644
--- a/arch/powerpc/kexec/core.c
+++ b/arch/powerpc/kexec/core.c
@@ -147,11 +147,18 @@ void __init reserve_crashkernel(void)
 	if (!crashk_res.start) {
 #ifdef CONFIG_PPC64
 		/*
-		 * On 64bit we split the RMO in half but cap it at half of
-		 * a small SLB (128MB) since the crash kernel needs to place
-		 * itself and some stacks to be in the first segment.
+		 * On the LPAR platform, place the crash kernel at the middle
+		 * of the RMA (512MB or more) so that the crash kernel gets
+		 * enough space to place itself and some stacks in the first
+		 * segment. At the same time, the normal kernel also gets
+		 * enough space to allocate memory for essential system
+		 * resources in the first segment. On other platforms, keep
+		 * the crash kernel start at the 128MB offset.
 		 */
-		crashk_res.start = min(0x8000000ULL, (ppc64_rma_size / 2));
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			crashk_res.start = ppc64_rma_size / 2;
+		else
+			crashk_res.start = min(0x8000000ULL, (ppc64_rma_size / 2));
 #else
 		crashk_res.start = KDUMP_KERNELBASE;
 #endif
-- 
2.31.1
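To make the arithmetic behind this change concrete, here is a minimal user-space sketch of how the default start offset is chosen. It is not kernel code: crashk_start_offset() and its is_lpar parameter are hypothetical stand-ins for the logic in reserve_crashkernel() and for firmware_has_feature(FW_FEATURE_LPAR), and all sizes are in bytes.

```c
#include <stdbool.h>
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

/* 128MB cap that stays in place for non-LPAR platforms. */
#define CRASHK_START_CAP (128 * MB)

/*
 * Hypothetical model of the default crash kernel start offset, used
 * when no explicit start address has been requested.
 */
static uint64_t crashk_start_offset(uint64_t rma_size, bool is_lpar)
{
	uint64_t mid = rma_size / 2;

	if (is_lpar)
		return mid;	/* LPAR: middle of the RMA, no cap */

	/* Other platforms: middle of the RMA, capped at 128MB. */
	return mid < CRASHK_START_CAP ? mid : CRASHK_START_CAP;
}
```

Under these assumptions, a 512MB RMA gives a 256MB start offset on LPAR and 128MB elsewhere, while a 256MB RMA gives 128MB on both paths, so the two branches only diverge once the RMA is larger than 256MB.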
Re: [PATCH 3/3] powerpc: Set crashkernel offset to mid of RMA region
On 04/10/21 21:36, Aneesh Kumar K.V wrote:
> On 10/4/21 20:41, Sourabh Jain wrote:
>> On large-config LPARs (192 or more cores), Linux fails to boot due to
>> insufficient memory in the first memory block. This is because the
>> reserved crashkernel area starts at a 128MB offset by default, which
>> doesn't leave enough space in the first memory block to accommodate
>> memory for other essential system resources.
>>
>> Given that the RMA region size can be 512MB or more, setting the
>> crashkernel offset to the middle of the RMA leaves enough space for
>> the kernel to allocate memory for other system resources in the first
>> memory block.
>>
>> Signed-off-by: Sourabh Jain
>> Reported-and-tested-by: Abdul haleem
>> ---
>>  arch/powerpc/kernel/rtas.c |  3 +++
>>  arch/powerpc/kexec/core.c  | 13 +++++++++----
>>  2 files changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
>> index ff80bbad22a5..ce5e62bb4d8e 100644
>> --- a/arch/powerpc/kernel/rtas.c
>> +++ b/arch/powerpc/kernel/rtas.c
>> @@ -1235,6 +1235,9 @@ int __init early_init_dt_scan_rtas(unsigned long node,
>>  	entryp = of_get_flat_dt_prop(node, "linux,rtas-entry", NULL);
>>  	sizep = of_get_flat_dt_prop(node, "rtas-size", NULL);
>>
>> +	if (of_get_flat_dt_prop(node, "ibm,hypertas-functions", NULL))
>> +		powerpc_firmware_features |= FW_FEATURE_LPAR;
>> +
>
> The equivalent check that we currently do does more than check for
> ibm,hypertas-functions:
>
> 	if (!strcmp(uname, "rtas") || !strcmp(uname, "rtas@0")) {
> 		prop = of_get_flat_dt_prop(node, "ibm,hypertas-functions",
> 					   &len);
> 		if (prop) {
> 			powerpc_firmware_features |= FW_FEATURE_LPAR;
> 			fw_hypertas_feature_init(prop, len);
> 		}
>
> Also, do we expect other firmware features to be set along with
> FW_FEATURE_LPAR?

This patch needs to move the crash kernel reservation to the midpoint of
the RMA for LPAR in reserve_crashkernel(). Since reserve_crashkernel() is
called very early, before FW_FEATURE_LPAR is set in
powerpc_firmware_features, the firmware_has_feature(FW_FEATURE_LPAR)
check fails. Hence we only need to make sure that this flag is set early,
during early_init_dt_scan_rtas(). The rest of the LPAR-specific
initialization isn't required at this point and will still be done during
pseries_probe_fw_features() as usual.

Thanks,
Sourabh Jain
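The ordering problem described in this reply can be illustrated with a small user-space model. Everything in it is a hypothetical stand-in: the feature bit value, the *_model function names, and the two globals that play the roles of powerpc_firmware_features and crashk_res.start. It only shows why the flag has to be set before the reservation runs.

```c
#include <stdbool.h>
#include <stdint.h>

#define MB (1024ULL * 1024ULL)
#define FW_FEATURE_LPAR_MODEL (1UL << 0)	/* illustrative bit, not the kernel's value */

static unsigned long fw_features_model;	/* stands in for powerpc_firmware_features */
static uint64_t crashk_start_model;	/* stands in for crashk_res.start */

static bool has_feature_model(unsigned long feature)
{
	return (fw_features_model & feature) != 0;
}

/* Early device-tree scan: sets only the LPAR bit when the property exists. */
static void early_scan_model(bool has_hypertas_prop)
{
	if (has_hypertas_prop)
		fw_features_model |= FW_FEATURE_LPAR_MODEL;
}

/* Reservation step: reads the feature bits as they are at call time. */
static void reserve_crashkernel_model(uint64_t rma_size)
{
	uint64_t mid = rma_size / 2;

	if (has_feature_model(FW_FEATURE_LPAR_MODEL))
		crashk_start_model = mid;
	else
		crashk_start_model = mid < 128 * MB ? mid : 128 * MB;
}

/* Runs the two steps in either order and returns the chosen start. */
static uint64_t boot_model(bool scan_first, uint64_t rma_size)
{
	fw_features_model = 0;
	if (scan_first)
		early_scan_model(true);
	reserve_crashkernel_model(rma_size);
	if (!scan_first)
		early_scan_model(true);	/* too late to affect the reservation */
	return crashk_start_model;
}
```

In this model, boot_model(true, 512 * MB) picks the 256MB midpoint, whereas boot_model(false, 512 * MB) falls back to the 128MB cap even though the partition is an LPAR, which is the failure mode the early flag assignment is meant to avoid.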
Re: [PATCH 3/3] powerpc: Set crashkernel offset to mid of RMA region
Hello Aneesh,

>> @@ -1235,6 +1235,9 @@ int __init early_init_dt_scan_rtas(unsigned long node,
>>  	entryp = of_get_flat_dt_prop(node, "linux,rtas-entry", NULL);
>>  	sizep = of_get_flat_dt_prop(node, "rtas-size", NULL);
>>
>> +	if (of_get_flat_dt_prop(node, "ibm,hypertas-functions", NULL))
>> +		powerpc_firmware_features |= FW_FEATURE_LPAR;
>> +
>
> The equivalent check that we currently do does more than check for
> ibm,hypertas-functions:
>
> 	if (!strcmp(uname, "rtas") || !strcmp(uname, "rtas@0")) {
> 		prop = of_get_flat_dt_prop(node, "ibm,hypertas-functions",
> 					   &len);
> 		if (prop) {
> 			powerpc_firmware_features |= FW_FEATURE_LPAR;
> 			fw_hypertas_feature_init(prop, len);
> 		}

If the ibm,hypertas-functions property has to be part of the rtas or
rtas@0 node for us to decide that we are on LPAR, then how about
splitting the probe_fw_features function into two functions: one to
detect FW_FEATURE_LPAR and another to do the rest?

> Also, do we expect other firmware features to be set along with
> FW_FEATURE_LPAR?

No, only the FW_FEATURE_LPAR feature, so that the kernel can decide the
crashkernel offset accordingly.

Thanks for the review.

- Sourabh Jain
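One possible shape for the split proposed in this reply is sketched below as a user-space model. It is an assumption about how such a split could look, not the code that was posted or merged: the *_sketch names, the bit values, and the placeholder "other" feature are all hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define FW_FEATURE_LPAR_SKETCH  (1UL << 0)	/* illustrative bit values */
#define FW_FEATURE_OTHER_SKETCH (1UL << 1)	/* placeholder for "the rest" */

static unsigned long fw_features_sketch;

/* Step 1 (early): only decide whether we are on LPAR. */
static bool detect_lpar_sketch(const char *uname, bool has_hypertas_prop)
{
	/* The property only counts when it lives in the rtas node. */
	if (strcmp(uname, "rtas") != 0 && strcmp(uname, "rtas@0") != 0)
		return false;
	if (!has_hypertas_prop)
		return false;

	fw_features_sketch |= FW_FEATURE_LPAR_SKETCH;
	return true;
}

/* Step 2 (later): the remaining feature setup, only done on LPAR. */
static void probe_remaining_features_sketch(void)
{
	if (fw_features_sketch & FW_FEATURE_LPAR_SKETCH)
		fw_features_sketch |= FW_FEATURE_OTHER_SKETCH;
}
```

The point of the split in this sketch is that the first step keeps the node-name check from the existing probe code while touching nothing except the LPAR bit, so it can run before the crash kernel reservation without pulling the rest of the feature setup forward.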
Re: [PATCH 3/3] powerpc: Set crashkernel offset to mid of RMA region
On 10/4/21 20:41, Sourabh Jain wrote:
> On large-config LPARs (192 or more cores), Linux fails to boot due to
> insufficient memory in the first memory block. This is because the
> reserved crashkernel area starts at a 128MB offset by default, which
> doesn't leave enough space in the first memory block to accommodate
> memory for other essential system resources.
>
> Given that the RMA region size can be 512MB or more, setting the
> crashkernel offset to the middle of the RMA leaves enough space for
> the kernel to allocate memory for other system resources in the first
> memory block.
>
> Signed-off-by: Sourabh Jain
> Reported-and-tested-by: Abdul haleem
> ---
>  arch/powerpc/kernel/rtas.c |  3 +++
>  arch/powerpc/kexec/core.c  | 13 +++++++++----
>  2 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index ff80bbad22a5..ce5e62bb4d8e 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -1235,6 +1235,9 @@ int __init early_init_dt_scan_rtas(unsigned long node,
>  	entryp = of_get_flat_dt_prop(node, "linux,rtas-entry", NULL);
>  	sizep = of_get_flat_dt_prop(node, "rtas-size", NULL);
>
> +	if (of_get_flat_dt_prop(node, "ibm,hypertas-functions", NULL))
> +		powerpc_firmware_features |= FW_FEATURE_LPAR;
> +

The equivalent check that we currently do does more than check for
ibm,hypertas-functions:

	if (!strcmp(uname, "rtas") || !strcmp(uname, "rtas@0")) {
		prop = of_get_flat_dt_prop(node, "ibm,hypertas-functions",
					   &len);
		if (prop) {
			powerpc_firmware_features |= FW_FEATURE_LPAR;
			fw_hypertas_feature_init(prop, len);
		}

Also, do we expect other firmware features to be set along with
FW_FEATURE_LPAR?

>  	if (basep && entryp && sizep) {
>  		rtas.base = *basep;
>  		rtas.entry = *entryp;
> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
> index 48525e8b5730..f69cf3e370ec 100644
> --- a/arch/powerpc/kexec/core.c
> +++ b/arch/powerpc/kexec/core.c
> @@ -147,11 +147,16 @@ void __init reserve_crashkernel(void)
>  	if (!crashk_res.start) {
>  #ifdef CONFIG_PPC64
>  		/*
> -		 * On 64bit we split the RMO in half but cap it at half of
> -		 * a small SLB (128MB) since the crash kernel needs to place
> -		 * itself and some stacks to be in the first segment.
> +		 * The crash kernel needs to be placed in the first segment. On
> +		 * LPAR, setting the crash kernel start to the middle of the RMA
> +		 * (512MB or more) helps the primary kernel boot properly on
> +		 * large-config LPARs (192 or more cores); for the rest, keep the
> +		 * crash kernel start capped at the 128MB offset.
>  		 */
> -		crashk_res.start = min(0x8000000ULL, (ppc64_rma_size / 2));
> +		if (firmware_has_feature(FW_FEATURE_LPAR))
> +			crashk_res.start = ppc64_rma_size / 2;
> +		else
> +			crashk_res.start = min(0x8000000ULL, (ppc64_rma_size / 2));
>  #else
>  		crashk_res.start = KDUMP_KERNELBASE;
>  #endif
[PATCH 3/3] powerpc: Set crashkernel offset to mid of RMA region
On large-config LPARs (192 or more cores), Linux fails to boot due to
insufficient memory in the first memory block. This is because the
reserved crashkernel area starts at a 128MB offset by default, which
doesn't leave enough space in the first memory block to accommodate
memory for other essential system resources.

Given that the RMA region size can be 512MB or more, setting the
crashkernel offset to the middle of the RMA leaves enough space for the
kernel to allocate memory for other system resources in the first memory
block.

Signed-off-by: Sourabh Jain
Reported-and-tested-by: Abdul haleem
---
 arch/powerpc/kernel/rtas.c |  3 +++
 arch/powerpc/kexec/core.c  | 13 +++++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index ff80bbad22a5..ce5e62bb4d8e 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1235,6 +1235,9 @@ int __init early_init_dt_scan_rtas(unsigned long node,
 	entryp = of_get_flat_dt_prop(node, "linux,rtas-entry", NULL);
 	sizep = of_get_flat_dt_prop(node, "rtas-size", NULL);
 
+	if (of_get_flat_dt_prop(node, "ibm,hypertas-functions", NULL))
+		powerpc_firmware_features |= FW_FEATURE_LPAR;
+
 	if (basep && entryp && sizep) {
 		rtas.base = *basep;
 		rtas.entry = *entryp;
diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
index 48525e8b5730..f69cf3e370ec 100644
--- a/arch/powerpc/kexec/core.c
+++ b/arch/powerpc/kexec/core.c
@@ -147,11 +147,16 @@ void __init reserve_crashkernel(void)
 	if (!crashk_res.start) {
 #ifdef CONFIG_PPC64
 		/*
-		 * On 64bit we split the RMO in half but cap it at half of
-		 * a small SLB (128MB) since the crash kernel needs to place
-		 * itself and some stacks to be in the first segment.
+		 * The crash kernel needs to be placed in the first segment. On
+		 * LPAR, setting the crash kernel start to the middle of the RMA
+		 * (512MB or more) helps the primary kernel boot properly on
+		 * large-config LPARs (192 or more cores); for the rest, keep the
+		 * crash kernel start capped at the 128MB offset.
 		 */
-		crashk_res.start = min(0x8000000ULL, (ppc64_rma_size / 2));
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			crashk_res.start = ppc64_rma_size / 2;
+		else
+			crashk_res.start = min(0x8000000ULL, (ppc64_rma_size / 2));
 #else
 		crashk_res.start = KDUMP_KERNELBASE;
 #endif
-- 
2.31.1