Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-02-06 Thread Tang Chen

On 02/07/2013 05:54 AM, Andrew Morton wrote:

On Wed, 06 Feb 2013 10:20:57 +0800
Tang Chen  wrote:



+   if (!strncmp(p, "acpi", max(4, strlen(p
+   movablemem_map.acpi = true;


Generates a warning:

mm/page_alloc.c: In function 'cmdline_parse_movablemem_map':
mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast

due to max(int, size_t).

This is easily fixed, but the code looks rather pointless.  If the
incoming string is supposed to be exactly "acpi" then use strcmp().  If
the incoming string must start with "acpi" then use strncmp(p, "acpi", 4).

IOW, the max is unneeded?


Hi Andrew,

I think I made another mistake here. I meant to use min(4, strlen(p)) in
case p is
something like 'aaa' whose length is less then 4. But I mistook it with
max().

But after I dig into strcmp() in the kernel, I think it is OK to use
strcmp().
min() or max() is not needed.


OK, I did that.

But the code still looks a bit more complex than we need.  Could we do

static int __init cmdline_parse_movablemem_map(char *p)
{
char *oldp;
u64 start_at, mem_size;

if (!p)
goto err;

/*
 * If user decide to use info from BIOS, all the other user specified
 * ranges will be ingored.
 */
if (!strcmp(p, "acpi")) {
movablemem_map.acpi = true;
if (movablemem_map.nr_map) {
memset(movablemem_map.map, 0,
sizeof(struct movablemem_entry)
* movablemem_map.nr_map);
movablemem_map.nr_map = 0;
}
return 0;
}



No, I don't think so.

If user specified like this:

1) movablemem_map=aaa@bbb -- will be added into array
2) movablemem_map=acpi-- will empty the array
3) movablemem_map=ccc@ddd -- will be added into array again (wrong!)

So, we need to code like this:

+   if (!strncmp(p, "acpi", max(4, strlen(p
+   movablemem_map.acpi = true;

In this way, 3) movablemem_map=ccc@ddd will not go into this if segment.

+
+   /*
+* If user decide to use info from BIOS, all the other user specified
+* ranges will be ingored.
+*/
+   if (movablemem_map.acpi) {
+   if (movablemem_map.nr_map) {
+   memset(movablemem_map.map, 0,
+   sizeof(struct movablemem_entry)
+   * movablemem_map.nr_map);
+   movablemem_map.nr_map = 0;
+   }
+   return 0;
+   }

But it will go into this if segment, and will not add the range into array.

Thanks. :)




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-02-06 Thread Andrew Morton
On Wed, 06 Feb 2013 10:20:57 +0800
Tang Chen  wrote:

> >>
> >> +  if (!strncmp(p, "acpi", max(4, strlen(p
> >> +  movablemem_map.acpi = true;
> >
> > Generates a warning:
> >
> > mm/page_alloc.c: In function 'cmdline_parse_movablemem_map':
> > mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a 
> > cast
> >
> > due to max(int, size_t).
> >
> > This is easily fixed, but the code looks rather pointless.  If the
> > incoming string is supposed to be exactly "acpi" then use strcmp().  If
> > the incoming string must start with "acpi" then use strncmp(p, "acpi", 4).
> >
> > IOW, the max is unneeded?
> 
> Hi Andrew,
> 
> I think I made another mistake here. I meant to use min(4, strlen(p)) in 
> case p is
> something like 'aaa' whose length is less then 4. But I mistook it with 
> max().
> 
> But after I dig into strcmp() in the kernel, I think it is OK to use 
> strcmp().
> min() or max() is not needed.

OK, I did that.

But the code still looks a bit more complex than we need.  Could we do

static int __init cmdline_parse_movablemem_map(char *p)
{
char *oldp;
u64 start_at, mem_size;

if (!p)
goto err;

/*
 * If user decide to use info from BIOS, all the other user specified
 * ranges will be ingored.
 */
if (!strcmp(p, "acpi")) {
movablemem_map.acpi = true;
if (movablemem_map.nr_map) {
memset(movablemem_map.map, 0,
sizeof(struct movablemem_entry)
* movablemem_map.nr_map);
movablemem_map.nr_map = 0;
}
return 0;
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-02-06 Thread Andrew Morton
On Wed, 06 Feb 2013 10:20:57 +0800
Tang Chen tangc...@cn.fujitsu.com wrote:

 
  +  if (!strncmp(p, acpi, max(4, strlen(p
  +  movablemem_map.acpi = true;
 
  Generates a warning:
 
  mm/page_alloc.c: In function 'cmdline_parse_movablemem_map':
  mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a 
  cast
 
  due to max(int, size_t).
 
  This is easily fixed, but the code looks rather pointless.  If the
  incoming string is supposed to be exactly acpi then use strcmp().  If
  the incoming string must start with acpi then use strncmp(p, acpi, 4).
 
  IOW, the max is unneeded?
 
 Hi Andrew,
 
 I think I made another mistake here. I meant to use min(4, strlen(p)) in 
 case p is
 something like 'aaa' whose length is less then 4. But I mistook it with 
 max().
 
 But after I dig into strcmp() in the kernel, I think it is OK to use 
 strcmp().
 min() or max() is not needed.

OK, I did that.

But the code still looks a bit more complex than we need.  Could we do

static int __init cmdline_parse_movablemem_map(char *p)
{
char *oldp;
u64 start_at, mem_size;

if (!p)
goto err;

/*
 * If user decide to use info from BIOS, all the other user specified
 * ranges will be ingored.
 */
if (!strcmp(p, acpi)) {
movablemem_map.acpi = true;
if (movablemem_map.nr_map) {
memset(movablemem_map.map, 0,
sizeof(struct movablemem_entry)
* movablemem_map.nr_map);
movablemem_map.nr_map = 0;
}
return 0;
}

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-02-06 Thread Tang Chen

On 02/07/2013 05:54 AM, Andrew Morton wrote:

On Wed, 06 Feb 2013 10:20:57 +0800
Tang Chentangc...@cn.fujitsu.com  wrote:



+   if (!strncmp(p, acpi, max(4, strlen(p
+   movablemem_map.acpi = true;


Generates a warning:

mm/page_alloc.c: In function 'cmdline_parse_movablemem_map':
mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast

due to max(int, size_t).

This is easily fixed, but the code looks rather pointless.  If the
incoming string is supposed to be exactly acpi then use strcmp().  If
the incoming string must start with acpi then use strncmp(p, acpi, 4).

IOW, the max is unneeded?


Hi Andrew,

I think I made another mistake here. I meant to use min(4, strlen(p)) in
case p is
something like 'aaa' whose length is less then 4. But I mistook it with
max().

But after I dig into strcmp() in the kernel, I think it is OK to use
strcmp().
min() or max() is not needed.


OK, I did that.

But the code still looks a bit more complex than we need.  Could we do

static int __init cmdline_parse_movablemem_map(char *p)
{
char *oldp;
u64 start_at, mem_size;

if (!p)
goto err;

/*
 * If user decide to use info from BIOS, all the other user specified
 * ranges will be ingored.
 */
if (!strcmp(p, acpi)) {
movablemem_map.acpi = true;
if (movablemem_map.nr_map) {
memset(movablemem_map.map, 0,
sizeof(struct movablemem_entry)
* movablemem_map.nr_map);
movablemem_map.nr_map = 0;
}
return 0;
}



No, I don't think so.

If user specified like this:

1) movablemem_map=aaa@bbb -- will be added into array
2) movablemem_map=acpi-- will empty the array
3) movablemem_map=ccc@ddd -- will be added into array again (wrong!)

So, we need to code like this:

+   if (!strncmp(p, acpi, max(4, strlen(p
+   movablemem_map.acpi = true;

In this way, 3) movablemem_map=ccc@ddd will not go into this if segment.

+
+   /*
+* If user decide to use info from BIOS, all the other user specified
+* ranges will be ingored.
+*/
+   if (movablemem_map.acpi) {
+   if (movablemem_map.nr_map) {
+   memset(movablemem_map.map, 0,
+   sizeof(struct movablemem_entry)
+   * movablemem_map.nr_map);
+   movablemem_map.nr_map = 0;
+   }
+   return 0;
+   }

But it will go into this if segment, and will not add the range into array.

Thanks. :)




--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-02-05 Thread Tang Chen

On 02/05/2013 07:26 AM, Andrew Morton wrote:

On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chen  wrote:


We now provide an option for users who don't want to specify physical
memory address in kernel commandline.

 /*
  * For movablemem_map=acpi:
  *
  * SRAT:|_| |_| |_| |_| ..
  * node id:0   1 1   2
  * hotpluggable:   n   y y   n
  * movablemem_map:  |_| |_|
  *
  * Using movablemem_map, we can prevent memblock from allocating memory
  * on ZONE_MOVABLE at boot time.
  */

So user just specify movablemem_map=acpi, and the kernel will use hotpluggable
info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE.

...

+   if (!strncmp(p, "acpi", max(4, strlen(p
+   movablemem_map.acpi = true;


Generates a warning:

mm/page_alloc.c: In function 'cmdline_parse_movablemem_map':
mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast

due to max(int, size_t).

This is easily fixed, but the code looks rather pointless.  If the
incoming string is supposed to be exactly "acpi" then use strcmp().  If
the incoming string must start with "acpi" then use strncmp(p, "acpi", 4).

IOW, the max is unneeded?


Hi Andrew,

I think I made another mistake here. I meant to use min(4, strlen(p)) in 
case p is
something like 'aaa' whose length is less then 4. But I mistook it with 
max().


But after I dig into strcmp() in the kernel, I think it is OK to use 
strcmp().

min() or max() is not needed.

Thanks. :)



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-02-05 Thread Tang Chen

On 02/05/2013 07:26 AM, Andrew Morton wrote:

On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chentangc...@cn.fujitsu.com  wrote:


We now provide an option for users who don't want to specify physical
memory address in kernel commandline.

 /*
  * For movablemem_map=acpi:
  *
  * SRAT:|_| |_| |_| |_| ..
  * node id:0   1 1   2
  * hotpluggable:   n   y y   n
  * movablemem_map:  |_| |_|
  *
  * Using movablemem_map, we can prevent memblock from allocating memory
  * on ZONE_MOVABLE at boot time.
  */

So user just specify movablemem_map=acpi, and the kernel will use hotpluggable
info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE.

...

+   if (!strncmp(p, acpi, max(4, strlen(p
+   movablemem_map.acpi = true;


Generates a warning:

mm/page_alloc.c: In function 'cmdline_parse_movablemem_map':
mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast

due to max(int, size_t).

This is easily fixed, but the code looks rather pointless.  If the
incoming string is supposed to be exactly acpi then use strcmp().  If
the incoming string must start with acpi then use strncmp(p, acpi, 4).

IOW, the max is unneeded?


Hi Andrew,

I think I made another mistake here. I meant to use min(4, strlen(p)) in 
case p is
something like 'aaa' whose length is less then 4. But I mistook it with 
max().


But after I dig into strcmp() in the kernel, I think it is OK to use 
strcmp().

min() or max() is not needed.

Thanks. :)



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-02-04 Thread Andrew Morton
On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chen  wrote:

> We now provide an option for users who don't want to specify physical
> memory address in kernel commandline.
> 
> /*
>  * For movablemem_map=acpi:
>  *
>  * SRAT:|_| |_| |_| |_| ..
>  * node id:0   1 1   2
>  * hotpluggable:   n   y y   n
>  * movablemem_map:  |_| |_|
>  *
>  * Using movablemem_map, we can prevent memblock from allocating 
> memory
>  * on ZONE_MOVABLE at boot time.
>  */
> 
> So user just specify movablemem_map=acpi, and the kernel will use hotpluggable
> info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE.
> 
> ...
>  
> + if (!strncmp(p, "acpi", max(4, strlen(p
> + movablemem_map.acpi = true;

Generates a warning:

mm/page_alloc.c: In function 'cmdline_parse_movablemem_map':
mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast

due to max(int, size_t).

This is easily fixed, but the code looks rather pointless.  If the
incoming string is supposed to be exactly "acpi" then use strcmp().  If
the incoming string must start with "acpi" then use strncmp(p, "acpi", 4).

IOW, the max is unneeded?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-02-04 Thread Andrew Morton
On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chen tangc...@cn.fujitsu.com wrote:

 We now provide an option for users who don't want to specify physical
 memory address in kernel commandline.
 
 /*
  * For movablemem_map=acpi:
  *
  * SRAT:|_| |_| |_| |_| ..
  * node id:0   1 1   2
  * hotpluggable:   n   y y   n
  * movablemem_map:  |_| |_|
  *
  * Using movablemem_map, we can prevent memblock from allocating 
 memory
  * on ZONE_MOVABLE at boot time.
  */
 
 So user just specify movablemem_map=acpi, and the kernel will use hotpluggable
 info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE.
 
 ...
  
 + if (!strncmp(p, acpi, max(4, strlen(p
 + movablemem_map.acpi = true;

Generates a warning:

mm/page_alloc.c: In function 'cmdline_parse_movablemem_map':
mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast

due to max(int, size_t).

This is easily fixed, but the code looks rather pointless.  If the
incoming string is supposed to be exactly acpi then use strcmp().  If
the incoming string must start with acpi then use strncmp(p, acpi, 4).

IOW, the max is unneeded?

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-28 Thread Luck, Tony
> I will post a patch to fix it. How about always keep node0 unhotpluggable ?

Node 0 (or more specifically the node that contains memory <4GB) will be
full of BIOS reserved holes in the memory map. It probably isn't removable
even if Linux thinks it is.  Someday we might have a smart BIOS that can
relocate itself to another node - but for now making node0 unhotpluggable
looks to be a plausible interim move.

Ultimately we'd like to be able to remove any node (just not all of them at
the same time ... just like we can now offline any cpu - but not all of them
together).

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-28 Thread Tang Chen

On 01/26/2013 09:12 AM, Andrew Morton wrote:

On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chen  wrote:


NOTE: Using this way will cause NUMA performance down because the whole node
   will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
   If users don't want to lose NUMA performance, just don't use it.


I agree with this, but it means that nobody will test any of your new code.

To get improved testing coverage, can you think of any temporary
testing-only patch which will cause testers to exercise the
memory-hotplug changes?


Hi Andrew,

OK,I‘ll think about it and post the testing patch. But I think a shell 
script
may be easier because the boot option is specified by user and the code 
only runs

when system is booting. So we need to reboot the system all the time.

Thanks. :)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-28 Thread Tang Chen

On 01/26/2013 09:12 AM, Andrew Morton wrote:

On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chentangc...@cn.fujitsu.com  wrote:


NOTE: Using this way will cause NUMA performance down because the whole node
   will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
   If users don't want to lose NUMA performance, just don't use it.


I agree with this, but it means that nobody will test any of your new code.

To get improved testing coverage, can you think of any temporary
testing-only patch which will cause testers to exercise the
memory-hotplug changes?


Hi Andrew,

OK,I‘ll think about it and post the testing patch. But I think a shell 
script
may be easier because the boot option is specified by user and the code 
only runs

when system is booting. So we need to reboot the system all the time.

Thanks. :)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-28 Thread Luck, Tony
 I will post a patch to fix it. How about always keep node0 unhotpluggable ?

Node 0 (or more specifically the node that contains memory 4GB) will be
full of BIOS reserved holes in the memory map. It probably isn't removable
even if Linux thinks it is.  Someday we might have a smart BIOS that can
relocate itself to another node - but for now making node0 unhotpluggable
looks to be a plausible interim move.

Ultimately we'd like to be able to remove any node (just not all of them at
the same time ... just like we can now offline any cpu - but not all of them
together).

-Tony
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-27 Thread Tang Chen

On 01/26/2013 09:29 AM, H. Peter Anvin wrote:

On 01/25/2013 05:12 PM, Andrew Morton wrote:

On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chen  wrote:


NOTE: Using this way will cause NUMA performance down because the whole node
   will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
   If users don't want to lose NUMA performance, just don't use it.


I agree with this, but it means that nobody will test any of your new code.

To get improved testing coverage, can you think of any temporary
testing-only patch which will cause testers to exercise the
memory-hotplug changes?



There is another problem: if ALL the nodes in the system support
hotpluggable memory, what happens?



Hi HPA,

I think I missed this case. If all the memory is hotpluggable, and user 
specified
movablemem_map=acpi, all the memory could be set as movable, and the 
kernel will

fail to start.

I will post a patch to fix it. How about always keep node0 unhotpluggable ?

Thanks. :)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-27 Thread Tang Chen

On 01/26/2013 09:29 AM, H. Peter Anvin wrote:

On 01/25/2013 05:12 PM, Andrew Morton wrote:

On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chentangc...@cn.fujitsu.com  wrote:


NOTE: Using this way will cause NUMA performance down because the whole node
   will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
   If users don't want to lose NUMA performance, just don't use it.


I agree with this, but it means that nobody will test any of your new code.

To get improved testing coverage, can you think of any temporary
testing-only patch which will cause testers to exercise the
memory-hotplug changes?



There is another problem: if ALL the nodes in the system support
hotpluggable memory, what happens?



Hi HPA,

I think I missed this case. If all the memory is hotpluggable, and user 
specified
movablemem_map=acpi, all the memory could be set as movable, and the 
kernel will

fail to start.

I will post a patch to fix it. How about always keep node0 unhotpluggable ?

Thanks. :)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-25 Thread H. Peter Anvin
On 01/25/2013 05:12 PM, Andrew Morton wrote:
> On Fri, 25 Jan 2013 17:42:09 +0800
> Tang Chen  wrote:
> 
>> NOTE: Using this way will cause NUMA performance down because the whole node
>>   will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
>>   If users don't want to lose NUMA performance, just don't use it.
> 
> I agree with this, but it means that nobody will test any of your new code.
> 
> To get improved testing coverage, can you think of any temporary
> testing-only patch which will cause testers to exercise the
> memory-hotplug changes?
> 

There is another problem: if ALL the nodes in the system support
hotpluggable memory, what happens?

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-25 Thread Andrew Morton
On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chen  wrote:

> NOTE: Using this way will cause NUMA performance down because the whole node
>   will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
>   If users don't want to lose NUMA performance, just don't use it.

I agree with this, but it means that nobody will test any of your new code.

To get improved testing coverage, can you think of any temporary
testing-only patch which will cause testers to exercise the
memory-hotplug changes?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-25 Thread Andrew Morton
On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chen  wrote:

> We now provide an option for users who don't want to specify physical
> memory address in kernel commandline.
> 
> /*
>  * For movablemem_map=acpi:
>  *
>  * SRAT:|_| |_| |_| |_| ..
>  * node id:0   1 1   2
>  * hotpluggable:   n   y y   n
>  * movablemem_map:  |_| |_|
>  *
>  * Using movablemem_map, we can prevent memblock from allocating 
> memory
>  * on ZONE_MOVABLE at boot time.
>  */
> 
> So user just specify movablemem_map=acpi, and the kernel will use hotpluggable
> info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE.

Well, as a reult of my previous hackery, arch/x86/mm/srat.c now looks
rather different.  Please check it carefully and runtime test this code
when it appears in linux-next?


/*
 * ACPI 3.0 based NUMA setup
 * Copyright 2004 Andi Kleen, SuSE Labs.
 *
 * Reads the ACPI SRAT table to figure out what memory belongs to which CPUs.
 *
 * Called from acpi_numa_init while reading the SRAT and SLIT tables.
 * Assumes all memory regions belonging to a single proximity domain
 * are in one chunk. Holes between them will be included in the node.
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int acpi_numa __initdata;

static __init int setup_node(int pxm)
{
return acpi_map_pxm_to_node(pxm);
}

static __init void bad_srat(void)
{
printk(KERN_ERR "SRAT: SRAT not used.\n");
acpi_numa = -1;
}

static __init inline int srat_disabled(void)
{
return acpi_numa < 0;
}

/* Callback for SLIT parsing */
void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
{
int i, j;

for (i = 0; i < slit->locality_count; i++)
for (j = 0; j < slit->locality_count; j++)
numa_set_distance(pxm_to_node(i), pxm_to_node(j),
slit->entry[slit->locality_count * i + j]);
}

/* Callback for Proximity Domain -> x2APIC mapping */
void __init
acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
{
int pxm, node;
int apic_id;

if (srat_disabled())
return;
if (pa->header.length < sizeof(struct acpi_srat_x2apic_cpu_affinity)) {
bad_srat();
return;
}
if ((pa->flags & ACPI_SRAT_CPU_ENABLED) == 0)
return;
pxm = pa->proximity_domain;
apic_id = pa->apic_id;
if (!apic->apic_id_valid(apic_id)) {
printk(KERN_INFO "SRAT: PXM %u -> X2APIC 0x%04x ignored\n",
 pxm, apic_id);
return;
}
node = setup_node(pxm);
if (node < 0) {
printk(KERN_ERR "SRAT: Too many proximity domains %x\n", pxm);
bad_srat();
return;
}

if (apic_id >= MAX_LOCAL_APIC) {
printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%04x -> Node %u 
skipped apicid that is too big\n", pxm, apic_id, node);
return;
}
set_apicid_to_node(apic_id, node);
node_set(node, numa_nodes_parsed);
acpi_numa = 1;
printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%04x -> Node %u\n",
   pxm, apic_id, node);
}

/* Callback for Proximity Domain -> LAPIC mapping */
void __init
acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
{
int pxm, node;
int apic_id;

if (srat_disabled())
return;
if (pa->header.length != sizeof(struct acpi_srat_cpu_affinity)) {
bad_srat();
return;
}
if ((pa->flags & ACPI_SRAT_CPU_ENABLED) == 0)
return;
pxm = pa->proximity_domain_lo;
if (acpi_srat_revision >= 2)
pxm |= *((unsigned int*)pa->proximity_domain_hi) << 8;
node = setup_node(pxm);
if (node < 0) {
printk(KERN_ERR "SRAT: Too many proximity domains %x\n", pxm);
bad_srat();
return;
}

if (get_uv_system_type() >= UV_X2APIC)
apic_id = (pa->apic_id << 8) | pa->local_sapic_eid;
else
apic_id = pa->apic_id;

if (apic_id >= MAX_LOCAL_APIC) {
printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%02x -> Node %u 
skipped apicid that is too big\n", pxm, apic_id, node);
return;
}

set_apicid_to_node(apic_id, node);
node_set(node, numa_nodes_parsed);
acpi_numa = 1;
printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%02x -> Node %u\n",
   pxm, apic_id, node);
}

#ifdef CONFIG_MEMORY_HOTPLUG
static inline int save_add_info(void) 

[PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-25 Thread Tang Chen
We now provide an option for users who don't want to specify physical
memory address in kernel commandline.

/*
 * For movablemem_map=acpi:
 *
 * SRAT:|_| |_| |_| |_| ..
 * node id:0   1 1   2
 * hotpluggable:   n   y y   n
 * movablemem_map:  |_| |_|
 *
 * Using movablemem_map, we can prevent memblock from allocating memory
 * on ZONE_MOVABLE at boot time.
 */

So user just specify movablemem_map=acpi, and the kernel will use hotpluggable
info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE.

NOTE: Using this way will cause NUMA performance down because the whole node
  will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
  If users don't want to lose NUMA performance, just don't use it.

Signed-off-by: Tang Chen 
---
 Documentation/kernel-parameters.txt |   23 ++-
 arch/x86/mm/srat.c  |   22 +-
 include/linux/mm.h  |1 +
 mm/page_alloc.c |   22 +-
 4 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 7770611..3d9dc9d 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1637,15 +1637,28 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
that the amount of memory usable for all allocations
is not too small.
 
+   movablemem_map=acpi
+   [KNL,X86,IA-64,PPC] This parameter is similar to
+   memmap except it specifies the memory map of
+   ZONE_MOVABLE.
+   This option inform the kernel to use Hot Pluggable bit
+   in flags from SRAT from ACPI BIOS to determine which
+   memory devices could be hotplugged. The corresponding
+   memory ranges will be set as ZONE_MOVABLE.
+
movablemem_map=nn[KMG]@ss[KMG]
[KNL,X86,IA-64,PPC] This parameter is similar to
memmap except it specifies the memory map of
ZONE_MOVABLE.
-   If more areas are all within one node, then from
-   lowest ss to the end of the node will be ZONE_MOVABLE.
-   If an area covers two or more nodes, the area from
-   ss to the end of the 1st node will be ZONE_MOVABLE,
-   and all the rest nodes will only have ZONE_MOVABLE.
+   If user specifies memory ranges, the info in SRAT will
+   be ingored. And it works like the following:
+   - If more ranges are all within one node, then from
+ lowest ss to the end of the node will be ZONE_MOVABLE.
+   - If a range is within a node, then from ss to the end
+ of the node will be ZONE_MOVABLE.
+   - If a range covers two or more nodes, then from ss to
+ the end of the 1st node will be ZONE_MOVABLE, and all
+ the rest nodes will only have ZONE_MOVABLE.
If memmap is specified at the same time, the
movablemem_map will be limited within the memmap
areas. If kernelcore or movablecore is also specified,
diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
index f841d0e..94d6e72 100644
--- a/arch/x86/mm/srat.c
+++ b/arch/x86/mm/srat.c
@@ -198,7 +198,23 @@ acpi_numa_memory_affinity_init(struct 
acpi_srat_mem_affinity *ma)
end_pfn = PFN_UP(end);
 
/*
-* For movablecore_map=nn[KMG]@ss[KMG]:
+* For movablemem_map=acpi:
+*
+* SRAT:|_| |_| |_| |_| ..
+* node id:0   1 1   2
+* hotpluggable:   n   y y   n
+* movablemem_map:  |_| |_|
+*
+* Using movablemem_map, we can prevent memblock from allocating memory
+* on ZONE_MOVABLE at boot time.
+*/
+   if (hotpluggable && movablemem_map.acpi) {
+   insert_movablemem_map(start_pfn, end_pfn);
+   goto out;
+   }
+
+   /*
+* For movablemem_map=nn[KMG]@ss[KMG]:
 *
 * SRAT:|_| |_| |_| |_| ..
 * node id:0   1 1   2
@@ -207,6 +223,8 @@ acpi_numa_memory_affinity_init(struct 
acpi_srat_mem_affinity *ma)
 *
 * Using movablemem_map, we can 

[PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-25 Thread Tang Chen
We now provide an option for users who don't want to specify physical
memory address in kernel commandline.

/*
 * For movablemem_map=acpi:
 *
 * SRAT:|_| |_| |_| |_| ..
 * node id:0   1 1   2
 * hotpluggable:   n   y y   n
 * movablemem_map:  |_| |_|
 *
 * Using movablemem_map, we can prevent memblock from allocating memory
 * on ZONE_MOVABLE at boot time.
 */

So user just specify movablemem_map=acpi, and the kernel will use hotpluggable
info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE.

NOTE: Using this way will cause NUMA performance down because the whole node
  will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
  If users don't want to lose NUMA performance, just don't use it.

Signed-off-by: Tang Chen tangc...@cn.fujitsu.com
---
 Documentation/kernel-parameters.txt |   23 ++-
 arch/x86/mm/srat.c  |   22 +-
 include/linux/mm.h  |1 +
 mm/page_alloc.c |   22 +-
 4 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 7770611..3d9dc9d 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1637,15 +1637,28 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
that the amount of memory usable for all allocations
is not too small.
 
+   movablemem_map=acpi
+   [KNL,X86,IA-64,PPC] This parameter is similar to
+   memmap except it specifies the memory map of
+   ZONE_MOVABLE.
+   This option inform the kernel to use Hot Pluggable bit
+   in flags from SRAT from ACPI BIOS to determine which
+   memory devices could be hotplugged. The corresponding
+   memory ranges will be set as ZONE_MOVABLE.
+
movablemem_map=nn[KMG]@ss[KMG]
[KNL,X86,IA-64,PPC] This parameter is similar to
memmap except it specifies the memory map of
ZONE_MOVABLE.
-   If more areas are all within one node, then from
-   lowest ss to the end of the node will be ZONE_MOVABLE.
-   If an area covers two or more nodes, the area from
-   ss to the end of the 1st node will be ZONE_MOVABLE,
-   and all the rest nodes will only have ZONE_MOVABLE.
+   If user specifies memory ranges, the info in SRAT will
+   be ingored. And it works like the following:
+   - If more ranges are all within one node, then from
+ lowest ss to the end of the node will be ZONE_MOVABLE.
+   - If a range is within a node, then from ss to the end
+ of the node will be ZONE_MOVABLE.
+   - If a range covers two or more nodes, then from ss to
+ the end of the 1st node will be ZONE_MOVABLE, and all
+ the rest nodes will only have ZONE_MOVABLE.
If memmap is specified at the same time, the
movablemem_map will be limited within the memmap
areas. If kernelcore or movablecore is also specified,
diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
index f841d0e..94d6e72 100644
--- a/arch/x86/mm/srat.c
+++ b/arch/x86/mm/srat.c
@@ -198,7 +198,23 @@ acpi_numa_memory_affinity_init(struct 
acpi_srat_mem_affinity *ma)
end_pfn = PFN_UP(end);
 
/*
-* For movablecore_map=nn[KMG]@ss[KMG]:
+* For movablemem_map=acpi:
+*
+* SRAT:|_| |_| |_| |_| ..
+* node id:0   1 1   2
+* hotpluggable:   n   y y   n
+* movablemem_map:  |_| |_|
+*
+* Using movablemem_map, we can prevent memblock from allocating memory
+* on ZONE_MOVABLE at boot time.
+*/
+   if (hotpluggable  movablemem_map.acpi) {
+   insert_movablemem_map(start_pfn, end_pfn);
+   goto out;
+   }
+
+   /*
+* For movablemem_map=nn[KMG]@ss[KMG]:
 *
 * SRAT:|_| |_| |_| |_| ..
 * node id:0   1 1   2
@@ -207,6 +223,8 @@ acpi_numa_memory_affinity_init(struct 
acpi_srat_mem_affinity *ma)
 *
 * Using 

Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-25 Thread Andrew Morton
On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chen tangc...@cn.fujitsu.com wrote:

 We now provide an option for users who don't want to specify physical
 memory address in kernel commandline.
 
 /*
  * For movablemem_map=acpi:
  *
  * SRAT:|_| |_| |_| |_| ..
  * node id:0   1 1   2
  * hotpluggable:   n   y y   n
  * movablemem_map:  |_| |_|
  *
  * Using movablemem_map, we can prevent memblock from allocating 
 memory
  * on ZONE_MOVABLE at boot time.
  */
 
 So user just specify movablemem_map=acpi, and the kernel will use hotpluggable
 info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE.

Well, as a reult of my previous hackery, arch/x86/mm/srat.c now looks
rather different.  Please check it carefully and runtime test this code
when it appears in linux-next?


/*
 * ACPI 3.0 based NUMA setup
 * Copyright 2004 Andi Kleen, SuSE Labs.
 *
 * Reads the ACPI SRAT table to figure out what memory belongs to which CPUs.
 *
 * Called from acpi_numa_init while reading the SRAT and SLIT tables.
 * Assumes all memory regions belonging to a single proximity domain
 * are in one chunk. Holes between them will be included in the node.
 */

#include linux/kernel.h
#include linux/acpi.h
#include linux/mmzone.h
#include linux/bitmap.h
#include linux/module.h
#include linux/topology.h
#include linux/bootmem.h
#include linux/memblock.h
#include linux/mm.h
#include asm/proto.h
#include asm/numa.h
#include asm/e820.h
#include asm/apic.h
#include asm/uv/uv.h

int acpi_numa __initdata;

static __init int setup_node(int pxm)
{
return acpi_map_pxm_to_node(pxm);
}

static __init void bad_srat(void)
{
printk(KERN_ERR SRAT: SRAT not used.\n);
acpi_numa = -1;
}

static __init inline int srat_disabled(void)
{
return acpi_numa  0;
}

/* Callback for SLIT parsing */
void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
{
int i, j;

for (i = 0; i  slit-locality_count; i++)
for (j = 0; j  slit-locality_count; j++)
numa_set_distance(pxm_to_node(i), pxm_to_node(j),
slit-entry[slit-locality_count * i + j]);
}

/* Callback for Proximity Domain - x2APIC mapping */
void __init
acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
{
int pxm, node;
int apic_id;

if (srat_disabled())
return;
if (pa-header.length  sizeof(struct acpi_srat_x2apic_cpu_affinity)) {
bad_srat();
return;
}
if ((pa-flags  ACPI_SRAT_CPU_ENABLED) == 0)
return;
pxm = pa-proximity_domain;
apic_id = pa-apic_id;
if (!apic-apic_id_valid(apic_id)) {
printk(KERN_INFO SRAT: PXM %u - X2APIC 0x%04x ignored\n,
 pxm, apic_id);
return;
}
node = setup_node(pxm);
if (node  0) {
printk(KERN_ERR SRAT: Too many proximity domains %x\n, pxm);
bad_srat();
return;
}

if (apic_id = MAX_LOCAL_APIC) {
printk(KERN_INFO SRAT: PXM %u - APIC 0x%04x - Node %u 
skipped apicid that is too big\n, pxm, apic_id, node);
return;
}
set_apicid_to_node(apic_id, node);
node_set(node, numa_nodes_parsed);
acpi_numa = 1;
printk(KERN_INFO SRAT: PXM %u - APIC 0x%04x - Node %u\n,
   pxm, apic_id, node);
}

/* Callback for Proximity Domain - LAPIC mapping */
void __init
acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
{
int pxm, node;
int apic_id;

if (srat_disabled())
return;
if (pa-header.length != sizeof(struct acpi_srat_cpu_affinity)) {
bad_srat();
return;
}
if ((pa-flags  ACPI_SRAT_CPU_ENABLED) == 0)
return;
pxm = pa-proximity_domain_lo;
if (acpi_srat_revision = 2)
pxm |= *((unsigned int*)pa-proximity_domain_hi)  8;
node = setup_node(pxm);
if (node  0) {
printk(KERN_ERR SRAT: Too many proximity domains %x\n, pxm);
bad_srat();
return;
}

if (get_uv_system_type() = UV_X2APIC)
apic_id = (pa-apic_id  8) | pa-local_sapic_eid;
else
apic_id = pa-apic_id;

if (apic_id = MAX_LOCAL_APIC) {
printk(KERN_INFO SRAT: PXM %u - APIC 0x%02x - Node %u 
skipped apicid that is too big\n, pxm, apic_id, node);
return;
}

set_apicid_to_node(apic_id, node);
node_set(node, numa_nodes_parsed);
acpi_numa = 1;
printk(KERN_INFO SRAT: PXM %u - APIC 

Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-25 Thread Andrew Morton
On Fri, 25 Jan 2013 17:42:09 +0800
Tang Chen tangc...@cn.fujitsu.com wrote:

 NOTE: Using this way will cause NUMA performance down because the whole node
   will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
   If users don't want to lose NUMA performance, just don't use it.

I agree with this, but it means that nobody will test any of your new code.

To get improved testing coverage, can you think of any temporary
testing-only patch which will cause testers to exercise the
memory-hotplug changes?

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-25 Thread H. Peter Anvin
On 01/25/2013 05:12 PM, Andrew Morton wrote:
 On Fri, 25 Jan 2013 17:42:09 +0800
 Tang Chen tangc...@cn.fujitsu.com wrote:
 
 NOTE: Using this way will cause NUMA performance down because the whole node
   will be set as ZONE_MOVABLE, and kernel cannot use memory on it.
   If users don't want to lose NUMA performance, just don't use it.
 
 I agree with this, but it means that nobody will test any of your new code.
 
 To get improved testing coverage, can you think of any temporary
 testing-only patch which will cause testers to exercise the
 memory-hotplug changes?
 

There is another problem: if ALL the nodes in the system support
hotpluggable memory, what happens?

-hpa

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/