Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-01-10 Thread Pingfan Liu
On Wed, Jan 9, 2019 at 10:25 PM Baoquan He  wrote:
>
> On 01/08/19 at 05:48pm, Mike Rapoport wrote:
> > On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > > Hi Mike,
> > >
> > > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > > I'm not thrilled by duplicating this code (yet again).
> > > > I liked the v3 of this patch [1] more, assuming we allow bottom-up mode to
> > > > allocate [0, kernel_start) unconditionally.
> > > > I'd just replace your first patch in v3 [2] with something like:
> > >
> > > In initmem_init(), we will restore the top-down allocation style anyway.
> > > While reserve_crashkernel() is called after initmem_init(), it's not
> > > appropriate to adjust memblock_find_in_range_node(), and we really want
> > > to find region bottom up for crashkernel reservation, no matter where
> > > kernel is loaded, better call __memblock_find_range_bottom_up().
> > >
> > > Create a wrapper to do the necessary handling, then call
> > > __memblock_find_range_bottom_up() directly, looks better.
> >
> > What bothers me is 'the necessary handling' which is already done in
> > several places in memblock in a similar, but yet slightly different way.
>
> The page aligning for start and the mirror flag setting, I suppose.
> >
> > memblock_find_in_range() and memblock_phys_alloc_nid() retry with different
> > MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that only when
> > allocating from the specified node and does not retry when it falls back to
> > any node. And memblock_alloc_internal() has yet another set of fallbacks.
>
> I get what you mean. It seems they try to allocate within the mirrored
> memory region first and, if that fails, fall back to the non-mirrored
> region. If a kernel data allocation fails, there is no need to care
> whether the memory is movable or not; the allocation needs to succeed
> first. For the bottom-up allocation wrapper, maybe we need to do the
> same?
>
> >
> > So what should be the necessary handling in the wrapper for
> > __memblock_find_range_bottom_up() ?
> >
> > BTW, even without any memblock modifications, retrying allocation in
> > reserve_crashkernel() for different ranges, like the proposal at [1] would
> > also work, wouldn't it?
>
> Yes, that also looks good. This patch makes only a single call, so it
> seems the simpler change.
>
> In fact, both the one below and this patch are fine with me, as long as
> the problem customers are complaining about gets fixed.
>
There seems to be a divergence of opinion. Maybe it is easier to
fix this bug with dyoung's patch. I will repost his patch.

Thanks and regards,
Pingfan
> >
> > [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
>
> Thanks
> Baoquan


Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-01-10 Thread Pingfan Liu
On Thu, Jan 10, 2019 at 3:57 PM Mike Rapoport  wrote:
>
> Hi Pingfan,
>
> On Wed, Jan 09, 2019 at 09:02:41PM +0800, Pingfan Liu wrote:
> > On Tue, Jan 8, 2019 at 11:49 PM Mike Rapoport  wrote:
> > >
> > > On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > > > Hi Mike,
> > > >
> > > > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > > > I'm not thrilled by duplicating this code (yet again).
> > > > > > I liked the v3 of this patch [1] more, assuming we allow bottom-up mode to
> > > > > > allocate [0, kernel_start) unconditionally.
> > > > > > I'd just replace your first patch in v3 [2] with something like:
> > > >
> > > > In initmem_init(), we will restore the top-down allocation style anyway.
> > > > While reserve_crashkernel() is called after initmem_init(), it's not
> > > > appropriate to adjust memblock_find_in_range_node(), and we really want
> > > > to find region bottom up for crashkernel reservation, no matter where
> > > > kernel is loaded, better call __memblock_find_range_bottom_up().
> > > >
> > > > Create a wrapper to do the necessary handling, then call
> > > > __memblock_find_range_bottom_up() directly, looks better.
> > >
> > > What bothers me is 'the necessary handling' which is already done in
> > > several places in memblock in a similar, but yet slightly different way.
> > >
> > > > memblock_find_in_range() and memblock_phys_alloc_nid() retry with different
> > > > MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that only when
> > > > allocating from the specified node and does not retry when it falls back to
> > > > any node. And memblock_alloc_internal() has yet another set of fallbacks.
> > >
> > > So what should be the necessary handling in the wrapper for
> > > __memblock_find_range_bottom_up() ?
> > >
> > Well, it is a hard choice.
> > > BTW, even without any memblock modifications, retrying allocation in
> > > > reserve_crashkernel() for different ranges, like the proposal at [1] would
> > > also work, wouldn't it?
> > >
> > Yes, it can work. Then is it worth exposing the bottom-up allocation
> > style for anything besides the hotmovable purpose?
>
> Some architectures use bottom-up as a "compatibility" mode with bootmem.
> And, I believe, powerpc and s390 use bottom-up to make some of the
> allocations close to the kernel.
>
Ok, got it. Thanks.

Best regards,
Pingfan

> > Thanks,
> > Pingfan
> > > [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
> > >
> > > > Thanks
> > > > Baoquan
> > > >
> > > > >
> > > > > diff --git a/mm/memblock.c b/mm/memblock.c
> > > > > index 7df468c..d1b30b9 100644
> > > > > --- a/mm/memblock.c
> > > > > +++ b/mm/memblock.c
> > > > > @@ -274,24 +274,14 @@ phys_addr_t __init_memblock 
> > > > > memblock_find_in_range_node(phys_addr_t size,
> > > > >  * try bottom-up allocation only when bottom-up mode
> > > > >  * is set and @end is above the kernel image.
> > > > >  */
> > > > > -   if (memblock_bottom_up() && end > kernel_end) {
> > > > > -   phys_addr_t bottom_up_start;
> > > > > -
> > > > > -   /* make sure we will allocate above the kernel */
> > > > > -   bottom_up_start = max(start, kernel_end);
> > > > > -
> > > > > +   if (memblock_bottom_up()) {
> > > > > /* ok, try bottom-up allocation first */
> > > > > -   ret = __memblock_find_range_bottom_up(bottom_up_start, 
> > > > > end,
> > > > > +   ret = __memblock_find_range_bottom_up(start, end,
> > > > >   size, align, nid, 
> > > > > flags);
> > > > > if (ret)
> > > > > return ret;
> > > > >
> > > > > /*
> > > > > -* we always limit bottom-up allocation above the kernel,
> > > > > -* but top-down allocation doesn't have the limit, so
> > > > > -* retrying top-down allocation may succeed when bottom-up
> > > > > -* allocation failed.
> > > > > -*
> > > > >  * bottom-up allocation is expected to be fail very 
> > > > > rarely,
> > > > >  * so we use WARN_ONCE() here to see the stack trace if
> > > > >  * fail happens.
> > > > >
> > > > > [1] 
> > > > > https://lore.kernel.org/lkml/1545966002-3075-3-git-send-email-kernelf...@gmail.com/
> > > > > [2] 
> > > > > https://lore.kernel.org/lkml/1545966002-3075-2-git-send-email-kernelf...@gmail.com/
> > > > >
> > > > > > +
> > > > > > + return ret;
> > > > > > +}
> > > > > > +
> > > > > >  /**
> > > > > >   * __memblock_find_range_top_down - find free area utility, in 
> > > > > > top-down
> > > > > >   * @start: start of candidate range
> > > > > > --
> > > > > > 2.7.4
> > > > > >
> > > > >
> > > > > --
> > > > > Sincerely yours,
> > > > > Mike.
> > > > >
> > > >
> > >
> > > --
> > > Sincerely yours,
> > > Mike.
> > >
> >
>
> --
> Sincerely yours,
> Mike.
>


Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-01-09 Thread Mike Rapoport
Hi Pingfan,

On Wed, Jan 09, 2019 at 09:02:41PM +0800, Pingfan Liu wrote:
> On Tue, Jan 8, 2019 at 11:49 PM Mike Rapoport  wrote:
> >
> > On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > > Hi Mike,
> > >
> > > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > > I'm not thrilled by duplicating this code (yet again).
> > > > > I liked the v3 of this patch [1] more, assuming we allow bottom-up mode to
> > > > > allocate [0, kernel_start) unconditionally.
> > > > > I'd just replace your first patch in v3 [2] with something like:
> > >
> > > In initmem_init(), we will restore the top-down allocation style anyway.
> > > While reserve_crashkernel() is called after initmem_init(), it's not
> > > appropriate to adjust memblock_find_in_range_node(), and we really want
> > > to find region bottom up for crashkernel reservation, no matter where
> > > kernel is loaded, better call __memblock_find_range_bottom_up().
> > >
> > > Create a wrapper to do the necessary handling, then call
> > > __memblock_find_range_bottom_up() directly, looks better.
> >
> > What bothers me is 'the necessary handling' which is already done in
> > several places in memblock in a similar, but yet slightly different way.
> >
> > memblock_find_in_range() and memblock_phys_alloc_nid() retry with different
> > MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that only when
> > allocating from the specified node and does not retry when it falls back to
> > any node. And memblock_alloc_internal() has yet another set of fallbacks.
> >
> > So what should be the necessary handling in the wrapper for
> > __memblock_find_range_bottom_up() ?
> >
> Well, it is a hard choice.
> > BTW, even without any memblock modifications, retrying allocation in
> > > reserve_crashkernel() for different ranges, like the proposal at [1] would
> > also work, wouldn't it?
> >
> Yes, it can work. Then is it worth exposing the bottom-up allocation
> style for anything besides the hotmovable purpose?

Some architectures use bottom-up as a "compatibility" mode with bootmem.
And, I believe, powerpc and s390 use bottom-up to make some of the
allocations close to the kernel.
 
> Thanks,
> Pingfan
> > [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
> >
> > > Thanks
> > > Baoquan
> > >
> > > >
> > > > diff --git a/mm/memblock.c b/mm/memblock.c
> > > > index 7df468c..d1b30b9 100644
> > > > --- a/mm/memblock.c
> > > > +++ b/mm/memblock.c
> > > > @@ -274,24 +274,14 @@ phys_addr_t __init_memblock 
> > > > memblock_find_in_range_node(phys_addr_t size,
> > > >  * try bottom-up allocation only when bottom-up mode
> > > >  * is set and @end is above the kernel image.
> > > >  */
> > > > -   if (memblock_bottom_up() && end > kernel_end) {
> > > > -   phys_addr_t bottom_up_start;
> > > > -
> > > > -   /* make sure we will allocate above the kernel */
> > > > -   bottom_up_start = max(start, kernel_end);
> > > > -
> > > > +   if (memblock_bottom_up()) {
> > > > /* ok, try bottom-up allocation first */
> > > > -   ret = __memblock_find_range_bottom_up(bottom_up_start, end,
> > > > +   ret = __memblock_find_range_bottom_up(start, end,
> > > >   size, align, nid, 
> > > > flags);
> > > > if (ret)
> > > > return ret;
> > > >
> > > > /*
> > > > -* we always limit bottom-up allocation above the kernel,
> > > > -* but top-down allocation doesn't have the limit, so
> > > > -* retrying top-down allocation may succeed when bottom-up
> > > > -* allocation failed.
> > > > -*
> > > >  * bottom-up allocation is expected to be fail very rarely,
> > > >  * so we use WARN_ONCE() here to see the stack trace if
> > > >  * fail happens.
> > > >
> > > > [1] 
> > > > https://lore.kernel.org/lkml/1545966002-3075-3-git-send-email-kernelf...@gmail.com/
> > > > [2] 
> > > > https://lore.kernel.org/lkml/1545966002-3075-2-git-send-email-kernelf...@gmail.com/
> > > >
> > > > > +
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > >  /**
> > > > >   * __memblock_find_range_top_down - find free area utility, in 
> > > > > top-down
> > > > >   * @start: start of candidate range
> > > > > --
> > > > > 2.7.4
> > > > >
> > > >
> > > > --
> > > > Sincerely yours,
> > > > Mike.
> > > >
> > >
> >
> > --
> > Sincerely yours,
> > Mike.
> >
> 

-- 
Sincerely yours,
Mike.



Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-01-09 Thread Baoquan He
On 01/08/19 at 05:48pm, Mike Rapoport wrote:
> On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > Hi Mike,
> > 
> > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > I'm not thrilled by duplicating this code (yet again).
> > > I liked the v3 of this patch [1] more, assuming we allow bottom-up mode to
> > > allocate [0, kernel_start) unconditionally. 
> > > > I'd just replace your first patch in v3 [2] with something like:
> > 
> > In initmem_init(), we will restore the top-down allocation style anyway.
> > While reserve_crashkernel() is called after initmem_init(), it's not
> > appropriate to adjust memblock_find_in_range_node(), and we really want
> > to find region bottom up for crashkernel reservation, no matter where
> > kernel is loaded, better call __memblock_find_range_bottom_up().
> > 
> > Create a wrapper to do the necessary handling, then call
> > __memblock_find_range_bottom_up() directly, looks better.
> 
> What bothers me is 'the necessary handling' which is already done in
> several places in memblock in a similar, but yet slightly different way.

The page aligning for start and the mirror flag setting, I suppose.
> 
> memblock_find_in_range() and memblock_phys_alloc_nid() retry with different
> MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that only when
> allocating from the specified node and does not retry when it falls back to
> any node. And memblock_alloc_internal() has yet another set of fallbacks. 

I get what you mean. It seems they try to allocate within the mirrored
memory region first and, if that fails, fall back to the non-mirrored
region. If a kernel data allocation fails, there is no need to care
whether the memory is movable or not; the allocation needs to succeed
first. For the bottom-up allocation wrapper, maybe we need to do the
same?
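
For illustration only, here is a small user-space sketch of the
"mirrored first, then anything" fallback described above, applied to a
bottom-up search. It is not the kernel code: the region array, the
sketch_* names and the SKETCH_MIRROR flag are hypothetical stand-ins for
memblock's data structures and MEMBLOCK_MIRROR, and 0 is used as the
"not found" value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

#define MB(x) ((phys_addr_t)(x) << 20)
#define GB(x) ((phys_addr_t)(x) << 30)

/* Hypothetical flag, standing in for MEMBLOCK_MIRROR. */
enum sketch_flags { SKETCH_NONE = 0, SKETCH_MIRROR = 1 };

/* Hypothetical free region [base, base + size), sorted by base. */
struct sketch_region {
	phys_addr_t base;
	phys_addr_t size;
	bool mirrored;
};

/* Round addr up to a multiple of align (align must be a power of two). */
static phys_addr_t sketch_align_up(phys_addr_t addr, phys_addr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Return the lowest aligned address in [start, end) where size bytes fit
 * inside a single region, or 0 if none fits. With SKETCH_MIRROR set,
 * only mirrored regions are considered.
 */
static phys_addr_t sketch_find_bottom_up(const struct sketch_region *r,
					 size_t n, phys_addr_t start,
					 phys_addr_t end, phys_addr_t size,
					 phys_addr_t align, int flags)
{
	for (size_t i = 0; i < n; i++) {
		phys_addr_t lo, hi, cand;

		if ((flags & SKETCH_MIRROR) && !r[i].mirrored)
			continue;

		/* Clip the region to the requested window. */
		lo = r[i].base > start ? r[i].base : start;
		hi = r[i].base + r[i].size < end ? r[i].base + r[i].size : end;
		if (lo >= hi)
			continue;

		cand = sketch_align_up(lo, align);
		if (cand < hi && hi - cand >= size)
			return cand;
	}
	return 0;
}

/* Try mirrored memory first if requested, then retry without the flag. */
static phys_addr_t sketch_find_with_fallback(const struct sketch_region *r,
					     size_t n, phys_addr_t start,
					     phys_addr_t end, phys_addr_t size,
					     phys_addr_t align,
					     bool prefer_mirror)
{
	int flags = prefer_mirror ? SKETCH_MIRROR : SKETCH_NONE;
	phys_addr_t ret;

	ret = sketch_find_bottom_up(r, n, start, end, size, align, flags);
	if (!ret && (flags & SKETCH_MIRROR))
		ret = sketch_find_bottom_up(r, n, start, end, size, align,
					    SKETCH_NONE);
	return ret;
}
```

The retry happens exactly once and only when the mirrored attempt found
nothing, so a request that fits in mirrored memory never touches the
non-mirrored regions.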

> 
> So what should be the necessary handling in the wrapper for
> __memblock_find_range_bottom_up() ?
> 
> BTW, even without any memblock modifications, retrying allocation in
> reserve_crashkernel() for different ranges, like the proposal at [1] would
> also work, wouldn't it?

Yes, that also looks good. This patch makes only a single call, so it
seems the simpler change.

In fact, both the one below and this patch are fine with me, as long as
the problem customers are complaining about gets fixed.

> 
> [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html

Thanks
Baoquan


Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-01-09 Thread Pingfan Liu
On Tue, Jan 8, 2019 at 11:49 PM Mike Rapoport  wrote:
>
> On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> > Hi Mike,
> >
> > On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > > I'm not thrilled by duplicating this code (yet again).
> > > I liked the v3 of this patch [1] more, assuming we allow bottom-up mode to
> > > allocate [0, kernel_start) unconditionally.
> > > I'd just replace your first patch in v3 [2] with something like:
> >
> > In initmem_init(), we will restore the top-down allocation style anyway.
> > While reserve_crashkernel() is called after initmem_init(), it's not
> > appropriate to adjust memblock_find_in_range_node(), and we really want
> > to find region bottom up for crashkernel reservation, no matter where
> > kernel is loaded, better call __memblock_find_range_bottom_up().
> >
> > Create a wrapper to do the necessary handling, then call
> > __memblock_find_range_bottom_up() directly, looks better.
>
> What bothers me is 'the necessary handling' which is already done in
> several places in memblock in a similar, but yet slightly different way.
>
> memblock_find_in_range() and memblock_phys_alloc_nid() retry with different
> MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that only when
> allocating from the specified node and does not retry when it falls back to
> any node. And memblock_alloc_internal() has yet another set of fallbacks.
>
> So what should be the necessary handling in the wrapper for
> __memblock_find_range_bottom_up() ?
>
Well, it is a hard choice.
> BTW, even without any memblock modifications, retrying allocation in
> reserve_crashkernel() for different ranges, like the proposal at [1] would
> also work, wouldn't it?
>
Yes, it can work. Then is it worth exposing the bottom-up allocation
style for anything besides the hotmovable purpose?

Thanks,
Pingfan
> [1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
>
> > Thanks
> > Baoquan
> >
> > >
> > > diff --git a/mm/memblock.c b/mm/memblock.c
> > > index 7df468c..d1b30b9 100644
> > > --- a/mm/memblock.c
> > > +++ b/mm/memblock.c
> > > @@ -274,24 +274,14 @@ phys_addr_t __init_memblock 
> > > memblock_find_in_range_node(phys_addr_t size,
> > >  * try bottom-up allocation only when bottom-up mode
> > >  * is set and @end is above the kernel image.
> > >  */
> > > -   if (memblock_bottom_up() && end > kernel_end) {
> > > -   phys_addr_t bottom_up_start;
> > > -
> > > -   /* make sure we will allocate above the kernel */
> > > -   bottom_up_start = max(start, kernel_end);
> > > -
> > > +   if (memblock_bottom_up()) {
> > > /* ok, try bottom-up allocation first */
> > > -   ret = __memblock_find_range_bottom_up(bottom_up_start, end,
> > > +   ret = __memblock_find_range_bottom_up(start, end,
> > >   size, align, nid, 
> > > flags);
> > > if (ret)
> > > return ret;
> > >
> > > /*
> > > -* we always limit bottom-up allocation above the kernel,
> > > -* but top-down allocation doesn't have the limit, so
> > > -* retrying top-down allocation may succeed when bottom-up
> > > -* allocation failed.
> > > -*
> > >  * bottom-up allocation is expected to be fail very rarely,
> > >  * so we use WARN_ONCE() here to see the stack trace if
> > >  * fail happens.
> > >
> > > [1] 
> > > https://lore.kernel.org/lkml/1545966002-3075-3-git-send-email-kernelf...@gmail.com/
> > > [2] 
> > > https://lore.kernel.org/lkml/1545966002-3075-2-git-send-email-kernelf...@gmail.com/
> > >
> > > > +
> > > > + return ret;
> > > > +}
> > > > +
> > > >  /**
> > > >   * __memblock_find_range_top_down - find free area utility, in top-down
> > > >   * @start: start of candidate range
> > > > --
> > > > 2.7.4
> > > >
> > >
> > > --
> > > Sincerely yours,
> > > Mike.
> > >
> >
>
> --
> Sincerely yours,
> Mike.
>


Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-01-08 Thread Mike Rapoport
On Tue, Jan 08, 2019 at 05:01:38PM +0800, Baoquan He wrote:
> Hi Mike,
> 
> On 01/08/19 at 10:05am, Mike Rapoport wrote:
> > I'm not thrilled by duplicating this code (yet again).
> > I liked the v3 of this patch [1] more, assuming we allow bottom-up mode to
> > allocate [0, kernel_start) unconditionally. 
> > I'd just replace your first patch in v3 [2] with something like:
> 
> In initmem_init(), we will restore the top-down allocation style anyway.
> While reserve_crashkernel() is called after initmem_init(), it's not
> appropriate to adjust memblock_find_in_range_node(), and we really want
> to find region bottom up for crashkernel reservation, no matter where
> kernel is loaded, better call __memblock_find_range_bottom_up().
> 
> Create a wrapper to do the necessary handling, then call
> __memblock_find_range_bottom_up() directly, looks better.

What bothers me is 'the necessary handling' which is already done in
several places in memblock in a similar, but yet slightly different way.

memblock_find_in_range() and memblock_phys_alloc_nid() retry with different
MEMBLOCK_MIRROR, but memblock_phys_alloc_try_nid() does that only when
allocating from the specified node and does not retry when it falls back to
any node. And memblock_alloc_internal() has yet another set of fallbacks. 

So what should be the necessary handling in the wrapper for
__memblock_find_range_bottom_up() ?

BTW, even without any memblock modifications, retrying allocation in
reserve_crashkernel() for different ranges, like the proposal at [1] would
also work, wouldn't it?

[1] http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
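
For illustration only, the retry idea can be sketched in user space as
below. This is not code from the thread or from [1]: it assumes the
ranges are tried in the order the patch description gives (below 896 MB,
then below 4G, then everything), that each attempt uses a top-down
search, and that 0 means "not found". The region array and all demo_*
names are hypothetical stand-ins for memblock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

#define MB(x) ((phys_addr_t)(x) << 20)
#define GB(x) ((phys_addr_t)(x) << 30)

/* Hypothetical free region [base, base + size), sorted by base. */
struct demo_region {
	phys_addr_t base;
	phys_addr_t size;
};

/* Round addr down to a multiple of align (align must be a power of two). */
static phys_addr_t demo_align_down(phys_addr_t addr, phys_addr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/*
 * Return the highest aligned address in [start, end) where size bytes
 * fit inside a single free region, or 0 if nothing fits.
 */
static phys_addr_t demo_find_top_down(const struct demo_region *r, size_t n,
				      phys_addr_t start, phys_addr_t end,
				      phys_addr_t size, phys_addr_t align)
{
	for (size_t i = n; i-- > 0; ) {
		phys_addr_t lo, hi, cand;

		/* Clip the region to the requested window. */
		lo = r[i].base > start ? r[i].base : start;
		hi = r[i].base + r[i].size < end ? r[i].base + r[i].size : end;
		if (lo >= hi || hi - lo < size)
			continue;

		cand = demo_align_down(hi - size, align);
		if (cand >= lo)
			return cand;
	}
	return 0;
}

/* Retry with a growing upper limit: 896 MB, then 4G, then max_addr. */
static phys_addr_t demo_reserve_retry(const struct demo_region *r, size_t n,
				      phys_addr_t size, phys_addr_t align,
				      phys_addr_t max_addr)
{
	const phys_addr_t limits[] = { MB(896), GB(4), max_addr };

	for (size_t i = 0; i < sizeof(limits) / sizeof(limits[0]); i++) {
		phys_addr_t end = limits[i] < max_addr ? limits[i] : max_addr;
		phys_addr_t base;

		if (align >= end)
			continue;

		base = demo_find_top_down(r, n, align, end, size, align);
		if (base)
			return base;
	}
	return 0;
}
```

With low memory fragmented so that no 384 MB block aligned to 128 MB
fits under 896 MB, the first attempt fails and the second one picks a
block below 4G. When low memory is intact, the first attempt already
succeeds, so the old placement is kept.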
 
> Thanks
> Baoquan
> 
> > 
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index 7df468c..d1b30b9 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -274,24 +274,14 @@ phys_addr_t __init_memblock 
> > memblock_find_in_range_node(phys_addr_t size,
> >  * try bottom-up allocation only when bottom-up mode
> >  * is set and @end is above the kernel image.
> >  */
> > -   if (memblock_bottom_up() && end > kernel_end) {
> > -   phys_addr_t bottom_up_start;
> > -
> > -   /* make sure we will allocate above the kernel */
> > -   bottom_up_start = max(start, kernel_end);
> > -
> > +   if (memblock_bottom_up()) {
> > /* ok, try bottom-up allocation first */
> > -   ret = __memblock_find_range_bottom_up(bottom_up_start, end,
> > +   ret = __memblock_find_range_bottom_up(start, end,
> >   size, align, nid, flags);
> > if (ret)
> > return ret;
> >  
> > /*
> > -* we always limit bottom-up allocation above the kernel,
> > -* but top-down allocation doesn't have the limit, so
> > -* retrying top-down allocation may succeed when bottom-up
> > -* allocation failed.
> > -*
> >  * bottom-up allocation is expected to be fail very rarely,
> >  * so we use WARN_ONCE() here to see the stack trace if
> >  * fail happens.
> > 
> > [1] 
> > https://lore.kernel.org/lkml/1545966002-3075-3-git-send-email-kernelf...@gmail.com/
> > [2] 
> > https://lore.kernel.org/lkml/1545966002-3075-2-git-send-email-kernelf...@gmail.com/
> > 
> > > +
> > > + return ret;
> > > +}
> > > +
> > >  /**
> > >   * __memblock_find_range_top_down - find free area utility, in top-down
> > >   * @start: start of candidate range
> > > -- 
> > > 2.7.4
> > > 
> > 
> > -- 
> > Sincerely yours,
> > Mike.
> > 
> 

-- 
Sincerely yours,
Mike.



Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-01-08 Thread Baoquan He
Hi Mike,

On 01/08/19 at 10:05am, Mike Rapoport wrote:
> I'm not thrilled by duplicating this code (yet again).
> I liked the v3 of this patch [1] more, assuming we allow bottom-up mode to
> allocate [0, kernel_start) unconditionally. 
> I'd just replace your first patch in v3 [2] with something like:

In initmem_init(), we restore the top-down allocation style anyway.
Since reserve_crashkernel() is called after initmem_init(), it's not
appropriate to adjust memblock_find_in_range_node(). We really want to
find the region bottom-up for the crashkernel reservation, no matter
where the kernel is loaded, so it is better to call
__memblock_find_range_bottom_up().

Creating a wrapper that does the necessary handling and then calls
__memblock_find_range_bottom_up() directly looks better.

Thanks
Baoquan

> 
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 7df468c..d1b30b9 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -274,24 +274,14 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
>* try bottom-up allocation only when bottom-up mode
>* is set and @end is above the kernel image.
>*/
> - if (memblock_bottom_up() && end > kernel_end) {
> - phys_addr_t bottom_up_start;
> -
> - /* make sure we will allocate above the kernel */
> - bottom_up_start = max(start, kernel_end);
> -
> + if (memblock_bottom_up()) {
>   /* ok, try bottom-up allocation first */
> - ret = __memblock_find_range_bottom_up(bottom_up_start, end,
> + ret = __memblock_find_range_bottom_up(start, end,
> size, align, nid, flags);
>   if (ret)
>   return ret;
>  
>   /*
> -  * we always limit bottom-up allocation above the kernel,
> -  * but top-down allocation doesn't have the limit, so
> -  * retrying top-down allocation may succeed when bottom-up
> -  * allocation failed.
> -  *
>* bottom-up allocation is expected to be fail very rarely,
>* so we use WARN_ONCE() here to see the stack trace if
>* fail happens.
> 
> [1] 
> https://lore.kernel.org/lkml/1545966002-3075-3-git-send-email-kernelf...@gmail.com/
> [2] 
> https://lore.kernel.org/lkml/1545966002-3075-2-git-send-email-kernelf...@gmail.com/
> 
> > +
> > +   return ret;
> > +}
> > +
> >  /**
> >   * __memblock_find_range_top_down - find free area utility, in top-down
> >   * @start: start of candidate range
> > -- 
> > 2.7.4
> > 
> 
> -- 
> Sincerely yours,
> Mike.
> 


Re: [PATCHv5] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-01-08 Thread Mike Rapoport
On Mon, Jan 07, 2019 at 04:04:59PM +0800, Pingfan Liu wrote:
> A customer reported a bug on a high-end server with many PCIe devices,
> where the kernel boots with crashkernel=384M and KASLR enabled. Even
> though there is still plenty of memory under 896 MB, finding a region
> fails intermittently, because currently we can only search under 896 MB
> when ',high' is not specified. KASLR randomly breaks the memory under
> 896 MB into several parts, and the crashkernel reservation needs to be
> aligned to 128 MB, which is why the search fails. It confuses end users
> that crashkernel=X sometimes works and sometimes fails.
> To make it succeed, the customer can change the kernel option to
> "crashkernel=384M,high". But this leaves "crashkernel=xx@yy" very
> limited room to work, even though its grammar looks more generic.
> And we can't confidently answer the questions customers raise:
> 1) why it doesn't succeed to reserve 896 MB;
> 2) what's wrong with the memory region under 4G;
> 3) why do I have to add ',high' when I only require 384 MB, not 3840 MB.
> 
> This patch simplifies the method suggested in the mail [1]. It just goes
> bottom-up to find a candidate region for crashkernel. Bottom-up may be
> more compatible with the old reservation style, i.e. we still want to get
> a memory region below 896 MB first, then in [896 MB, 4G], and finally
> above 4G.
> 
> There is one trivial thing about compatibility with old kexec-tools:
> if the reserved region is above 896M, the old tool will fail to load
> bzImage. But without this patch, the old tool also fails, since no
> memory below 896M can be reserved for crashkernel.
> 
> [1]: http://lists.infradead.org/pipermail/kexec/2017-October/019571.html
> Signed-off-by: Pingfan Liu 
> Cc: Tang Chen 
> Cc: "Rafael J. Wysocki" 
> Cc: Len Brown 
> Cc: Andrew Morton 
> Cc: Mike Rapoport 
> Cc: Michal Hocko 
> Cc: Jonathan Corbet 
> Cc: Yaowei Bai 
> Cc: Pavel Tatashin 
> Cc: Nicholas Piggin 
> Cc: Naoya Horiguchi 
> Cc: Daniel Vacek 
> Cc: Mathieu Malaterre 
> Cc: Stefan Agner 
> Cc: Dave Young 
> Cc: Baoquan He 
> Cc: ying...@kernel.org,
> Cc: vgo...@redhat.com
> Cc: linux-kernel@vger.kernel.org
> ---
> v4 -> v5:
>   add a wrapper of bottom up allocation func
> v3 -> v4:
>   instead of exporting the stage of parsing mem hotplug info, just using the 
> bottom-up allocation func directly
>  arch/x86/kernel/setup.c  |  8 
>  include/linux/memblock.h |  3 +++
>  mm/memblock.c| 29 +
>  3 files changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index d494b9b..80e7923 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -546,10 +546,10 @@ static void __init reserve_crashkernel(void)
>* as old kexec-tools loads bzImage below that, unless
>* "crashkernel=size[KMG],high" is specified.
>*/
> - crash_base = memblock_find_in_range(CRASH_ALIGN,
> - high ? CRASH_ADDR_HIGH_MAX
> -  : CRASH_ADDR_LOW_MAX,
> - crash_size, CRASH_ALIGN);
> + crash_base = memblock_find_range_bottom_up(CRASH_ALIGN,
> + (max_pfn * PAGE_SIZE), crash_size, CRASH_ALIGN,
> + NUMA_NO_NODE);
> +
>   if (!crash_base) {
>   pr_info("crashkernel reservation failed - No suitable 
> area found.\n");
>   return;
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index aee299a..a35ae17 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -116,6 +116,9 @@ phys_addr_t memblock_find_in_range_node(phys_addr_t size, 
> phys_addr_t align,
>   int nid, enum memblock_flags flags);
>  phys_addr_t memblock_find_in_range(phys_addr_t start, phys_addr_t end,
>  phys_addr_t size, phys_addr_t align);
> +phys_addr_t __init_memblock
> +memblock_find_range_bottom_up(phys_addr_t start, phys_addr_t end,
> + phys_addr_t size, phys_addr_t align, int nid);
>  void memblock_allow_resize(void);
>  int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid);
>  int memblock_add(phys_addr_t base, phys_addr_t size);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 81ae63c..f68287e 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -192,6 +192,35 @@ __memblock_find_range_bottom_up(phys_addr_t start, 
> phys_addr_t end,
>   return 0;
>  }
> 
> +phys_addr_t __init_memblock
> +memblock_find_range_bottom_up(phys_addr_t start, phys_addr_t end,
> + phys_addr_t size, phys_addr_t align, int nid)
> +{
> + phys_addr_t ret;
> + enum memblock_flags flags = choose_memblock_flags();
> +
> + /* pump up @end */
> + if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
> + end =