Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
On Wed, Jan 9, 2019 at 10:25 PM Baoquan He wrote:
> On 01/08/19 at 05:48pm, Mike Rapoport wrote:
> > On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > > Hi Mike,
> > >
> > > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > > I'm not thrilled by duplicating this code (yet again).
> > > > I liked the v3 of this patch [1] more, assuming we allow bottom-up
> > > > mode to allocate [0, kernel_start) unconditionally.
> > > > I'd just replace your first patch in v3 [2] with something like:
> > >
> > > In initmem_init(), we will restore the top-down allocation style
> > > anyway. Since reserve_crashkernel() is called after initmem_init(),
> > > it's not appropriate to adjust memblock_find_in_range_node(); we
> > > really want to find a region bottom-up for the crashkernel
> > > reservation. No matter where the kernel is loaded, it is better to
> > > call __memblock_find_range_bottom_up().
> > >
> > > Create a wrapper to do the necessary handling, then call
> > > __memblock_find_range_bottom_up() directly; that looks better.
> >
> > What bothers me is 'the necessary handling', which is already done in
> > several places in memblock in a similar, yet slightly different, way.
>
> The page aligning of start and the mirror flag setting, I suppose.
>
> > memblock_find_in_range() and memblock_phys_alloc_nid() retry with a
> > different MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that
> > only when allocating from the specified node and does not retry when
> > it falls back to any node. And memblock_alloc_internal() has yet
> > another set of fallbacks.
>
> I see what you mean: they try to allocate within the mirrored memory
> region first and, on failure, fall back to the non-mirrored regions. If
> a kernel data allocation fails, there's no need to care whether the
> memory is movable or not; the kernel needs to survive first. For the
> bottom-up allocation wrapper, maybe we need to do it like this too?
>
> > So what should be the necessary handling in the wrapper for
> > __memblock_find_range_bottom_up()?
> >
> > BTW, even without any memblock modifications, retrying the allocation
> > in reserve_crashkernel() for different ranges, like the proposal at
> > [1], would also work, wouldn't it?
>
> Yes, that also looks good, and this patch only needs one call, so it
> seems a simpler change.
>
> In fact, either the one below or this patch is fine with me, as long as
> it fixes the problem customers are complaining about.

It seems opinions diverge here. Maybe it is easier to fix this bug with
dyoung's patch; I will repost it.

Thanks and regards,
Pingfan

> > [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
>
> Thanks
> Baoquan
Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
On Thu, Jan 10, 2019 at 3:57 PM Mike Rapoport wrote:
> Hi Pingfan,
>
> On Wed, Jan 09, 2019 at 09:02:41PM +0800, Pingfan Liu wrote:
> > On Tue, Jan 8, 2019 at 11:49 PM Mike Rapoport wrote:
> > >
> > > On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > > > Hi Mike,
> > > >
> > > > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > > > I'm not thrilled by duplicating this code (yet again).
> > > > > I liked the v3 of this patch [1] more, assuming we allow
> > > > > bottom-up mode to allocate [0, kernel_start) unconditionally.
> > > > > I'd just replace your first patch in v3 [2] with something like:
> > > >
> > > > In initmem_init(), we will restore the top-down allocation style
> > > > anyway. Since reserve_crashkernel() is called after
> > > > initmem_init(), it's not appropriate to adjust
> > > > memblock_find_in_range_node(); we really want to find a region
> > > > bottom-up for the crashkernel reservation. No matter where the
> > > > kernel is loaded, it is better to call
> > > > __memblock_find_range_bottom_up().
> > > >
> > > > Create a wrapper to do the necessary handling, then call
> > > > __memblock_find_range_bottom_up() directly; that looks better.
> > >
> > > What bothers me is 'the necessary handling', which is already done
> > > in several places in memblock in a similar, yet slightly different,
> > > way.
> > >
> > > memblock_find_in_range() and memblock_phys_alloc_nid() retry with a
> > > different MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does
> > > that only when allocating from the specified node and does not
> > > retry when it falls back to any node. And memblock_alloc_internal()
> > > has yet another set of fallbacks.
> > >
> > > So what should be the necessary handling in the wrapper for
> > > __memblock_find_range_bottom_up()?
> > >
> > Well, it is a hard choice.
>
> > > BTW, even without any memblock modifications, retrying the
> > > allocation in reserve_crashkernel() for different ranges, like the
> > > proposal at [1], would also work, wouldn't it?
> > >
> > Yes, it can work. Then is it worth exposing the bottom-up allocation
> > style beyond the hot-movable use case?
>
> Some architectures use bottom-up as a "compatibility" mode with
> bootmem. And, I believe, powerpc and s390 use bottom-up to make some of
> the allocations close to the kernel.

Ok, got it. Thanks.

Best regards,
Pingfan

> > Thanks,
> > Pingfan
> >
> > > [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
[...]

> --
> Sincerely yours,
> Mike.
Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
Hi Pingfan,

On Wed, Jan 09, 2019 at 09:02:41PM +0800, Pingfan Liu wrote:
> On Tue, Jan 8, 2019 at 11:49 PM Mike Rapoport wrote:
> >
> > On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > > Hi Mike,
> > >
> > > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > > I'm not thrilled by duplicating this code (yet again).
> > > > I liked the v3 of this patch [1] more, assuming we allow
> > > > bottom-up mode to allocate [0, kernel_start) unconditionally.
> > > > I'd just replace your first patch in v3 [2] with something like:
> > >
> > > In initmem_init(), we will restore the top-down allocation style
> > > anyway. Since reserve_crashkernel() is called after initmem_init(),
> > > it's not appropriate to adjust memblock_find_in_range_node(); we
> > > really want to find a region bottom-up for the crashkernel
> > > reservation. No matter where the kernel is loaded, it is better to
> > > call __memblock_find_range_bottom_up().
> > >
> > > Create a wrapper to do the necessary handling, then call
> > > __memblock_find_range_bottom_up() directly; that looks better.
> >
> > What bothers me is 'the necessary handling', which is already done in
> > several places in memblock in a similar, yet slightly different, way.
> >
> > memblock_find_in_range() and memblock_phys_alloc_nid() retry with a
> > different MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does
> > that only when allocating from the specified node and does not retry
> > when it falls back to any node. And memblock_alloc_internal() has yet
> > another set of fallbacks.
> >
> > So what should be the necessary handling in the wrapper for
> > __memblock_find_range_bottom_up()?
> >
> Well, it is a hard choice.
>
> > BTW, even without any memblock modifications, retrying the allocation
> > in reserve_crashkernel() for different ranges, like the proposal at
> > [1], would also work, wouldn't it?
> >
> Yes, it can work. Then is it worth exposing the bottom-up allocation
> style beyond the hot-movable use case?

Some architectures use bottom-up as a "compatibility" mode with bootmem.
And, I believe, powerpc and s390 use bottom-up to make some of the
allocations close to the kernel.

> Thanks,
> Pingfan
>
> > [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
[...]

--
Sincerely yours,
Mike.
Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
On 01/08/19 at 05:48pm, Mike Rapoport wrote:
> On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > Hi Mike,
> >
> > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > I'm not thrilled by duplicating this code (yet again).
> > > I liked the v3 of this patch [1] more, assuming we allow bottom-up
> > > mode to allocate [0, kernel_start) unconditionally.
> > > I'd just replace your first patch in v3 [2] with something like:
> >
> > In initmem_init(), we will restore the top-down allocation style
> > anyway. Since reserve_crashkernel() is called after initmem_init(),
> > it's not appropriate to adjust memblock_find_in_range_node(); we
> > really want to find a region bottom-up for the crashkernel
> > reservation. No matter where the kernel is loaded, it is better to
> > call __memblock_find_range_bottom_up().
> >
> > Create a wrapper to do the necessary handling, then call
> > __memblock_find_range_bottom_up() directly; that looks better.
>
> What bothers me is 'the necessary handling', which is already done in
> several places in memblock in a similar, yet slightly different, way.

The page aligning of start and the mirror flag setting, I suppose.

> memblock_find_in_range() and memblock_phys_alloc_nid() retry with a
> different MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that
> only when allocating from the specified node and does not retry when it
> falls back to any node. And memblock_alloc_internal() has yet another
> set of fallbacks.

I see what you mean: they try to allocate within the mirrored memory
region first and, on failure, fall back to the non-mirrored regions. If
a kernel data allocation fails, there's no need to care whether the
memory is movable or not; the kernel needs to survive first. For the
bottom-up allocation wrapper, maybe we need to do it like this too?

> So what should be the necessary handling in the wrapper for
> __memblock_find_range_bottom_up()?
>
> BTW, even without any memblock modifications, retrying the allocation
> in reserve_crashkernel() for different ranges, like the proposal at
> [1], would also work, wouldn't it?

Yes, that also looks good, and this patch only needs one call, so it
seems a simpler change.

In fact, either the one below or this patch is fine with me, as long as
it fixes the problem customers are complaining about.

> [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html

Thanks
Baoquan
Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
On Tue, Jan 8, 2019 at 11:49 PM Mike Rapoport wrote:
>
> On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > Hi Mike,
> >
> > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > I'm not thrilled by duplicating this code (yet again).
> > > I liked the v3 of this patch [1] more, assuming we allow bottom-up
> > > mode to allocate [0, kernel_start) unconditionally.
> > > I'd just replace your first patch in v3 [2] with something like:
> >
> > In initmem_init(), we will restore the top-down allocation style
> > anyway. Since reserve_crashkernel() is called after initmem_init(),
> > it's not appropriate to adjust memblock_find_in_range_node(); we
> > really want to find a region bottom-up for the crashkernel
> > reservation. No matter where the kernel is loaded, it is better to
> > call __memblock_find_range_bottom_up().
> >
> > Create a wrapper to do the necessary handling, then call
> > __memblock_find_range_bottom_up() directly; that looks better.
>
> What bothers me is 'the necessary handling', which is already done in
> several places in memblock in a similar, yet slightly different, way.
>
> memblock_find_in_range() and memblock_phys_alloc_nid() retry with a
> different MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that
> only when allocating from the specified node and does not retry when it
> falls back to any node. And memblock_alloc_internal() has yet another
> set of fallbacks.
>
> So what should be the necessary handling in the wrapper for
> __memblock_find_range_bottom_up()?
>
Well, it is a hard choice.

> BTW, even without any memblock modifications, retrying the allocation
> in reserve_crashkernel() for different ranges, like the proposal at
> [1], would also work, wouldn't it?
>
Yes, it can work. Then is it worth exposing the bottom-up allocation
style beyond the hot-movable use case?

Thanks,
Pingfan

> [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
[...]

> --
> Sincerely yours,
> Mike.
Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> Hi Mike,
>
> On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > I'm not thrilled by duplicating this code (yet again).
> > I liked the v3 of this patch [1] more, assuming we allow bottom-up
> > mode to allocate [0, kernel_start) unconditionally.
> > I'd just replace your first patch in v3 [2] with something like:
>
> In initmem_init(), we will restore the top-down allocation style
> anyway. Since reserve_crashkernel() is called after initmem_init(),
> it's not appropriate to adjust memblock_find_in_range_node(); we
> really want to find a region bottom-up for the crashkernel
> reservation. No matter where the kernel is loaded, it is better to
> call __memblock_find_range_bottom_up().
>
> Create a wrapper to do the necessary handling, then call
> __memblock_find_range_bottom_up() directly; that looks better.

What bothers me is 'the necessary handling', which is already done in
several places in memblock in a similar, yet slightly different, way.

memblock_find_in_range() and memblock_phys_alloc_nid() retry with a
different MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that
only when allocating from the specified node and does not retry when it
falls back to any node. And memblock_alloc_internal() has yet another
set of fallbacks.

So what should be the necessary handling in the wrapper for
__memblock_find_range_bottom_up()?

BTW, even without any memblock modifications, retrying the allocation in
reserve_crashkernel() for different ranges, like the proposal at [1],
would also work, wouldn't it?

[1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html

> Thanks
> Baoquan
[...]

--
Sincerely yours,
Mike.
Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
Hi Mike,

On 01/08/19 at 10:05am, Mike Rapoport wrote:
> I'm not thrilled by duplicating this code (yet again).
> I liked the v3 of this patch [1] more, assuming we allow bottom-up mode
> to allocate [0, kernel_start) unconditionally.
> I'd just replace your first patch in v3 [2] with something like:

In initmem_init(), we will restore the top-down allocation style anyway.
Since reserve_crashkernel() is called after initmem_init(), it's not
appropriate to adjust memblock_find_in_range_node(); we really want to
find a region bottom-up for the crashkernel reservation. No matter where
the kernel is loaded, it is better to call
__memblock_find_range_bottom_up().

Create a wrapper to do the necessary handling, then call
__memblock_find_range_bottom_up() directly; that looks better.

Thanks
Baoquan

> diff --git a/mm/memblock.c b/mm/memblock.c
> index 7df468c..d1b30b9 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -274,24 +274,14 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
> 	 * try bottom-up allocation only when bottom-up mode
> 	 * is set and @end is above the kernel image.
> 	 */
> -	if (memblock_bottom_up() && end > kernel_end) {
> -		phys_addr_t bottom_up_start;
> -
> -		/* make sure we will allocate above the kernel */
> -		bottom_up_start = max(start, kernel_end);
> -
> +	if (memblock_bottom_up()) {
> 		/* ok, try bottom-up allocation first */
> -		ret = __memblock_find_range_bottom_up(bottom_up_start, end,
> +		ret = __memblock_find_range_bottom_up(start, end,
> 						      size, align, nid, flags);
> 		if (ret)
> 			return ret;
>
> 		/*
> -		 * we always limit bottom-up allocation above the kernel,
> -		 * but top-down allocation doesn't have the limit, so
> -		 * retrying top-down allocation may succeed when bottom-up
> -		 * allocation failed.
> -		 *
> 		 * bottom-up allocation is expected to be fail very rarely,
> 		 * so we use WARN_ONCE() here to see the stack trace if
> 		 * fail happens.
>
> [1] https://lore.kernel.org/lkml/1545966002-3075-3-git-send-email-kernelf...@gmail.com/
> [2] https://lore.kernel.org/lkml/1545966002-3075-2-git-send-email-kernelf...@gmail.com/
>
> > +
> > +	return ret;
> > +}
> > +
> >  /**
> >   * __memblock_find_range_top_down - find free area utility, in top-down
> >   * @start: start of candidate range
> > --
> > 2.7.4
>
> --
> Sincerely yours,
> Mike.
Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
On Mon, Jan 07, 2019 at 04:04:59PM +0800, Pingfan Liu wrote:
> A customer reported a bug on a high-end server with many PCIe devices,
> where the kernel booted with crashkernel=384M and KASLR enabled. Even
> though we still see much memory under 896 MB, the search still failed
> intermittently, because currently we can only find a region under
> 896 MB if ',high' is not specified. KASLR breaks the 896 MB into
> several parts randomly, and the crashkernel reservation needs to be
> aligned to 128 MB; that's why the failure occurs. It confuses the end
> user that crashkernel=X sometimes works and sometimes fails.
>
> To make it succeed, the customer can change the kernel option to
> "crashkernel=384M,high". But this gives "crashkernel=xx@yy" a very
> limited space to behave, even though its grammar looks more generic.
> And we can't answer questions raised by the customer confidently:
> 1) why it doesn't succeed to reserve 896 MB;
> 2) what's wrong with the memory region under 4G;
> 3) why ',high' must be added when only 384 MB is required, not 3840 MB.
>
> This patch simplifies the method suggested in the mail [1]. It just
> goes bottom-up to find a candidate region for crashkernel. Bottom-up
> may be better compatible with the old reservation style, i.e. it still
> tries to get a memory region from under 896 MB first, then
> [896 MB, 4G], finally above 4G.
>
> There is one trivial thing about compatibility with old kexec-tools:
> if the reserved region is above 896M, then the old tool will fail to
> load a bzImage. But without this patch, the old tool also fails, since
> there is no memory below 896M that can be reserved for crashkernel.
>
> [1]: http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
> Signed-off-by: Pingfan Liu
> Cc: Tang Chen
> Cc: "Rafael J. Wysocki"
> Cc: Len Brown
> Cc: Andrew Morton
> Cc: Mike Rapoport
> Cc: Michal Hocko
> Cc: Jonathan Corbet
> Cc: Yaowei Bai
> Cc: Pavel Tatashin
> Cc: Nicholas Piggin
> Cc: Naoya Horiguchi
> Cc: Daniel Vacek
> Cc: Mathieu Malaterre
> Cc: Stefan Agner
> Cc: Dave Young
> Cc: Baoquan He
> Cc: ying...@kernel.org
> Cc: vgo...@redhat.com
> Cc: linux-kernel@vger.kernel.org
> ---
> v4 -> v5:
>   add a wrapper of the bottom-up allocation func
> v3 -> v4:
>   instead of exporting the stage of parsing mem hotplug info, just use
>   the bottom-up allocation func directly
>
>  arch/x86/kernel/setup.c  |  8
>  include/linux/memblock.h |  3 +++
>  mm/memblock.c            | 29 +
>  3 files changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index d494b9b..80e7923 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -546,10 +546,10 @@ static void __init reserve_crashkernel(void)
> 	 * as old kexec-tools loads bzImage below that, unless
> 	 * "crashkernel=size[KMG],high" is specified.
> 	 */
> -	crash_base = memblock_find_in_range(CRASH_ALIGN,
> -					    high ? CRASH_ADDR_HIGH_MAX
> -						 : CRASH_ADDR_LOW_MAX,
> -					    crash_size, CRASH_ALIGN);
> +	crash_base = memblock_find_range_bottom_up(CRASH_ALIGN,
> +			(max_pfn * PAGE_SIZE), crash_size, CRASH_ALIGN,
> +			NUMA_NO_NODE);
> +
> 	if (!crash_base) {
> 		pr_info("crashkernel reservation failed - No suitable area found.\n");
> 		return;
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index aee299a..a35ae17 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -116,6 +116,9 @@ phys_addr_t memblock_find_in_range_node(phys_addr_t size, phys_addr_t align,
> 			int nid, enum memblock_flags flags);
>  phys_addr_t memblock_find_in_range(phys_addr_t start, phys_addr_t end,
>  				   phys_addr_t size, phys_addr_t align);
> +phys_addr_t __init_memblock
> +memblock_find_range_bottom_up(phys_addr_t start, phys_addr_t end,
> +			      phys_addr_t size, phys_addr_t align, int nid);
>  void memblock_allow_resize(void);
>  int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid);
>  int memblock_add(phys_addr_t base, phys_addr_t size);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 81ae63c..f68287e 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -192,6 +192,35 @@ __memblock_find_range_bottom_up(phys_addr_t start, phys_addr_t end,
> 	return 0;
>  }
>
> +phys_addr_t __init_memblock
> +memblock_find_range_bottom_up(phys_addr_t start, phys_addr_t end,
> +			      phys_addr_t size, phys_addr_t align, int nid)
> +{
> +	phys_addr_t ret;
> +	enum memblock_flags flags = choose_memblock_flags();
> +
> +	/* pump up @end */
> +	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
> +		end =