Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 02/07/2013 05:54 AM, Andrew Morton wrote: On Wed, 06 Feb 2013 10:20:57 +0800 Tang Chen wrote: + if (!strncmp(p, "acpi", max(4, strlen(p + movablemem_map.acpi = true; Generates a warning: mm/page_alloc.c: In function 'cmdline_parse_movablemem_map': mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast due to max(int, size_t). This is easily fixed, but the code looks rather pointless. If the incoming string is supposed to be exactly "acpi" then use strcmp(). If the incoming string must start with "acpi" then use strncmp(p, "acpi", 4). IOW, the max is unneeded? Hi Andrew, I think I made another mistake here. I meant to use min(4, strlen(p)) in case p is something like 'aaa' whose length is less then 4. But I mistook it with max(). But after I dig into strcmp() in the kernel, I think it is OK to use strcmp(). min() or max() is not needed. OK, I did that. But the code still looks a bit more complex than we need. Could we do static int __init cmdline_parse_movablemem_map(char *p) { char *oldp; u64 start_at, mem_size; if (!p) goto err; /* * If user decide to use info from BIOS, all the other user specified * ranges will be ingored. */ if (!strcmp(p, "acpi")) { movablemem_map.acpi = true; if (movablemem_map.nr_map) { memset(movablemem_map.map, 0, sizeof(struct movablemem_entry) * movablemem_map.nr_map); movablemem_map.nr_map = 0; } return 0; } No, I don't think so. If user specified like this: 1) movablemem_map=aaa@bbb -- will be added into array 2) movablemem_map=acpi-- will empty the array 3) movablemem_map=ccc@ddd -- will be added into array again (wrong!) So, we need to code like this: + if (!strncmp(p, "acpi", max(4, strlen(p + movablemem_map.acpi = true; In this way, 3) movablemem_map=ccc@ddd will not go into this if segment. + + /* +* If user decide to use info from BIOS, all the other user specified +* ranges will be ingored. +*/ + if (movablemem_map.acpi) { + if (movablemem_map.nr_map) { + memset(movablemem_map.map, 0, + sizeof(struct movablemem_entry) + * movablemem_map.nr_map); + movablemem_map.nr_map = 0; + } + return 0; + } But it will go into this if segment, and will not add the range into array. Thanks. :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On Wed, 06 Feb 2013 10:20:57 +0800 Tang Chen wrote: > >> > >> + if (!strncmp(p, "acpi", max(4, strlen(p > >> + movablemem_map.acpi = true; > > > > Generates a warning: > > > > mm/page_alloc.c: In function 'cmdline_parse_movablemem_map': > > mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a > > cast > > > > due to max(int, size_t). > > > > This is easily fixed, but the code looks rather pointless. If the > > incoming string is supposed to be exactly "acpi" then use strcmp(). If > > the incoming string must start with "acpi" then use strncmp(p, "acpi", 4). > > > > IOW, the max is unneeded? > > Hi Andrew, > > I think I made another mistake here. I meant to use min(4, strlen(p)) in > case p is > something like 'aaa' whose length is less then 4. But I mistook it with > max(). > > But after I dig into strcmp() in the kernel, I think it is OK to use > strcmp(). > min() or max() is not needed. OK, I did that. But the code still looks a bit more complex than we need. Could we do static int __init cmdline_parse_movablemem_map(char *p) { char *oldp; u64 start_at, mem_size; if (!p) goto err; /* * If user decide to use info from BIOS, all the other user specified * ranges will be ingored. */ if (!strcmp(p, "acpi")) { movablemem_map.acpi = true; if (movablemem_map.nr_map) { memset(movablemem_map.map, 0, sizeof(struct movablemem_entry) * movablemem_map.nr_map); movablemem_map.nr_map = 0; } return 0; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On Wed, 06 Feb 2013 10:20:57 +0800 Tang Chen tangc...@cn.fujitsu.com wrote: + if (!strncmp(p, acpi, max(4, strlen(p + movablemem_map.acpi = true; Generates a warning: mm/page_alloc.c: In function 'cmdline_parse_movablemem_map': mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast due to max(int, size_t). This is easily fixed, but the code looks rather pointless. If the incoming string is supposed to be exactly acpi then use strcmp(). If the incoming string must start with acpi then use strncmp(p, acpi, 4). IOW, the max is unneeded? Hi Andrew, I think I made another mistake here. I meant to use min(4, strlen(p)) in case p is something like 'aaa' whose length is less then 4. But I mistook it with max(). But after I dig into strcmp() in the kernel, I think it is OK to use strcmp(). min() or max() is not needed. OK, I did that. But the code still looks a bit more complex than we need. Could we do static int __init cmdline_parse_movablemem_map(char *p) { char *oldp; u64 start_at, mem_size; if (!p) goto err; /* * If user decide to use info from BIOS, all the other user specified * ranges will be ingored. */ if (!strcmp(p, acpi)) { movablemem_map.acpi = true; if (movablemem_map.nr_map) { memset(movablemem_map.map, 0, sizeof(struct movablemem_entry) * movablemem_map.nr_map); movablemem_map.nr_map = 0; } return 0; } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 02/07/2013 05:54 AM, Andrew Morton wrote: On Wed, 06 Feb 2013 10:20:57 +0800 Tang Chentangc...@cn.fujitsu.com wrote: + if (!strncmp(p, acpi, max(4, strlen(p + movablemem_map.acpi = true; Generates a warning: mm/page_alloc.c: In function 'cmdline_parse_movablemem_map': mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast due to max(int, size_t). This is easily fixed, but the code looks rather pointless. If the incoming string is supposed to be exactly acpi then use strcmp(). If the incoming string must start with acpi then use strncmp(p, acpi, 4). IOW, the max is unneeded? Hi Andrew, I think I made another mistake here. I meant to use min(4, strlen(p)) in case p is something like 'aaa' whose length is less then 4. But I mistook it with max(). But after I dig into strcmp() in the kernel, I think it is OK to use strcmp(). min() or max() is not needed. OK, I did that. But the code still looks a bit more complex than we need. Could we do static int __init cmdline_parse_movablemem_map(char *p) { char *oldp; u64 start_at, mem_size; if (!p) goto err; /* * If user decide to use info from BIOS, all the other user specified * ranges will be ingored. */ if (!strcmp(p, acpi)) { movablemem_map.acpi = true; if (movablemem_map.nr_map) { memset(movablemem_map.map, 0, sizeof(struct movablemem_entry) * movablemem_map.nr_map); movablemem_map.nr_map = 0; } return 0; } No, I don't think so. If user specified like this: 1) movablemem_map=aaa@bbb -- will be added into array 2) movablemem_map=acpi-- will empty the array 3) movablemem_map=ccc@ddd -- will be added into array again (wrong!) So, we need to code like this: + if (!strncmp(p, acpi, max(4, strlen(p + movablemem_map.acpi = true; In this way, 3) movablemem_map=ccc@ddd will not go into this if segment. + + /* +* If user decide to use info from BIOS, all the other user specified +* ranges will be ingored. +*/ + if (movablemem_map.acpi) { + if (movablemem_map.nr_map) { + memset(movablemem_map.map, 0, + sizeof(struct movablemem_entry) + * movablemem_map.nr_map); + movablemem_map.nr_map = 0; + } + return 0; + } But it will go into this if segment, and will not add the range into array. Thanks. :) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 02/05/2013 07:26 AM, Andrew Morton wrote: On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen wrote: We now provide an option for users who don't want to specify physical memory address in kernel commandline. /* * For movablemem_map=acpi: * * SRAT:|_| |_| |_| |_| .. * node id:0 1 1 2 * hotpluggable: n y y n * movablemem_map: |_| |_| * * Using movablemem_map, we can prevent memblock from allocating memory * on ZONE_MOVABLE at boot time. */ So user just specify movablemem_map=acpi, and the kernel will use hotpluggable info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE. ... + if (!strncmp(p, "acpi", max(4, strlen(p + movablemem_map.acpi = true; Generates a warning: mm/page_alloc.c: In function 'cmdline_parse_movablemem_map': mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast due to max(int, size_t). This is easily fixed, but the code looks rather pointless. If the incoming string is supposed to be exactly "acpi" then use strcmp(). If the incoming string must start with "acpi" then use strncmp(p, "acpi", 4). IOW, the max is unneeded? Hi Andrew, I think I made another mistake here. I meant to use min(4, strlen(p)) in case p is something like 'aaa' whose length is less then 4. But I mistook it with max(). But after I dig into strcmp() in the kernel, I think it is OK to use strcmp(). min() or max() is not needed. Thanks. :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 02/05/2013 07:26 AM, Andrew Morton wrote: On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chentangc...@cn.fujitsu.com wrote: We now provide an option for users who don't want to specify physical memory address in kernel commandline. /* * For movablemem_map=acpi: * * SRAT:|_| |_| |_| |_| .. * node id:0 1 1 2 * hotpluggable: n y y n * movablemem_map: |_| |_| * * Using movablemem_map, we can prevent memblock from allocating memory * on ZONE_MOVABLE at boot time. */ So user just specify movablemem_map=acpi, and the kernel will use hotpluggable info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE. ... + if (!strncmp(p, acpi, max(4, strlen(p + movablemem_map.acpi = true; Generates a warning: mm/page_alloc.c: In function 'cmdline_parse_movablemem_map': mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast due to max(int, size_t). This is easily fixed, but the code looks rather pointless. If the incoming string is supposed to be exactly acpi then use strcmp(). If the incoming string must start with acpi then use strncmp(p, acpi, 4). IOW, the max is unneeded? Hi Andrew, I think I made another mistake here. I meant to use min(4, strlen(p)) in case p is something like 'aaa' whose length is less then 4. But I mistook it with max(). But after I dig into strcmp() in the kernel, I think it is OK to use strcmp(). min() or max() is not needed. Thanks. :) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen wrote: > We now provide an option for users who don't want to specify physical > memory address in kernel commandline. > > /* > * For movablemem_map=acpi: > * > * SRAT:|_| |_| |_| |_| .. > * node id:0 1 1 2 > * hotpluggable: n y y n > * movablemem_map: |_| |_| > * > * Using movablemem_map, we can prevent memblock from allocating > memory > * on ZONE_MOVABLE at boot time. > */ > > So user just specify movablemem_map=acpi, and the kernel will use hotpluggable > info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE. > > ... > > + if (!strncmp(p, "acpi", max(4, strlen(p > + movablemem_map.acpi = true; Generates a warning: mm/page_alloc.c: In function 'cmdline_parse_movablemem_map': mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast due to max(int, size_t). This is easily fixed, but the code looks rather pointless. If the incoming string is supposed to be exactly "acpi" then use strcmp(). If the incoming string must start with "acpi" then use strncmp(p, "acpi", 4). IOW, the max is unneeded? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen tangc...@cn.fujitsu.com wrote: We now provide an option for users who don't want to specify physical memory address in kernel commandline. /* * For movablemem_map=acpi: * * SRAT:|_| |_| |_| |_| .. * node id:0 1 1 2 * hotpluggable: n y y n * movablemem_map: |_| |_| * * Using movablemem_map, we can prevent memblock from allocating memory * on ZONE_MOVABLE at boot time. */ So user just specify movablemem_map=acpi, and the kernel will use hotpluggable info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE. ... + if (!strncmp(p, acpi, max(4, strlen(p + movablemem_map.acpi = true; Generates a warning: mm/page_alloc.c: In function 'cmdline_parse_movablemem_map': mm/page_alloc.c:5312: warning: comparison of distinct pointer types lacks a cast due to max(int, size_t). This is easily fixed, but the code looks rather pointless. If the incoming string is supposed to be exactly acpi then use strcmp(). If the incoming string must start with acpi then use strncmp(p, acpi, 4). IOW, the max is unneeded? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
> I will post a patch to fix it. How about always keep node0 unhotpluggable ? Node 0 (or more specifically the node that contains memory <4GB) will be full of BIOS reserved holes in the memory map. It probably isn't removable even if Linux thinks it is. Someday we might have a smart BIOS that can relocate itself to another node - but for now making node0 unhotpluggable looks to be a plausible interim move. Ultimately we'd like to be able to remove any node (just not all of them at the same time ... just like we can now offline any cpu - but not all of them together). -Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 01/26/2013 09:12 AM, Andrew Morton wrote: On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen wrote: NOTE: Using this way will cause NUMA performance down because the whole node will be set as ZONE_MOVABLE, and kernel cannot use memory on it. If users don't want to lose NUMA performance, just don't use it. I agree with this, but it means that nobody will test any of your new code. To get improved testing coverage, can you think of any temporary testing-only patch which will cause testers to exercise the memory-hotplug changes? Hi Andrew, OK,I‘ll think about it and post the testing patch. But I think a shell script may be easier because the boot option is specified by user and the code only runs when system is booting. So we need to reboot the system all the time. Thanks. :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 01/26/2013 09:12 AM, Andrew Morton wrote: On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chentangc...@cn.fujitsu.com wrote: NOTE: Using this way will cause NUMA performance down because the whole node will be set as ZONE_MOVABLE, and kernel cannot use memory on it. If users don't want to lose NUMA performance, just don't use it. I agree with this, but it means that nobody will test any of your new code. To get improved testing coverage, can you think of any temporary testing-only patch which will cause testers to exercise the memory-hotplug changes? Hi Andrew, OK,I‘ll think about it and post the testing patch. But I think a shell script may be easier because the boot option is specified by user and the code only runs when system is booting. So we need to reboot the system all the time. Thanks. :) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
I will post a patch to fix it. How about always keep node0 unhotpluggable ? Node 0 (or more specifically the node that contains memory 4GB) will be full of BIOS reserved holes in the memory map. It probably isn't removable even if Linux thinks it is. Someday we might have a smart BIOS that can relocate itself to another node - but for now making node0 unhotpluggable looks to be a plausible interim move. Ultimately we'd like to be able to remove any node (just not all of them at the same time ... just like we can now offline any cpu - but not all of them together). -Tony -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 01/26/2013 09:29 AM, H. Peter Anvin wrote: On 01/25/2013 05:12 PM, Andrew Morton wrote: On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen wrote: NOTE: Using this way will cause NUMA performance down because the whole node will be set as ZONE_MOVABLE, and kernel cannot use memory on it. If users don't want to lose NUMA performance, just don't use it. I agree with this, but it means that nobody will test any of your new code. To get improved testing coverage, can you think of any temporary testing-only patch which will cause testers to exercise the memory-hotplug changes? There is another problem: if ALL the nodes in the system support hotpluggable memory, what happens? Hi HPA, I think I missed this case. If all the memory is hotpluggable, and user specified movablemem_map=acpi, all the memory could be set as movable, and the kernel will fail to start. I will post a patch to fix it. How about always keep node0 unhotpluggable ? Thanks. :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 01/26/2013 09:29 AM, H. Peter Anvin wrote: On 01/25/2013 05:12 PM, Andrew Morton wrote: On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chentangc...@cn.fujitsu.com wrote: NOTE: Using this way will cause NUMA performance down because the whole node will be set as ZONE_MOVABLE, and kernel cannot use memory on it. If users don't want to lose NUMA performance, just don't use it. I agree with this, but it means that nobody will test any of your new code. To get improved testing coverage, can you think of any temporary testing-only patch which will cause testers to exercise the memory-hotplug changes? There is another problem: if ALL the nodes in the system support hotpluggable memory, what happens? Hi HPA, I think I missed this case. If all the memory is hotpluggable, and user specified movablemem_map=acpi, all the memory could be set as movable, and the kernel will fail to start. I will post a patch to fix it. How about always keep node0 unhotpluggable ? Thanks. :) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 01/25/2013 05:12 PM, Andrew Morton wrote: > On Fri, 25 Jan 2013 17:42:09 +0800 > Tang Chen wrote: > >> NOTE: Using this way will cause NUMA performance down because the whole node >> will be set as ZONE_MOVABLE, and kernel cannot use memory on it. >> If users don't want to lose NUMA performance, just don't use it. > > I agree with this, but it means that nobody will test any of your new code. > > To get improved testing coverage, can you think of any temporary > testing-only patch which will cause testers to exercise the > memory-hotplug changes? > There is another problem: if ALL the nodes in the system support hotpluggable memory, what happens? -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen wrote: > NOTE: Using this way will cause NUMA performance down because the whole node > will be set as ZONE_MOVABLE, and kernel cannot use memory on it. > If users don't want to lose NUMA performance, just don't use it. I agree with this, but it means that nobody will test any of your new code. To get improved testing coverage, can you think of any temporary testing-only patch which will cause testers to exercise the memory-hotplug changes? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen wrote: > We now provide an option for users who don't want to specify physical > memory address in kernel commandline. > > /* > * For movablemem_map=acpi: > * > * SRAT:|_| |_| |_| |_| .. > * node id:0 1 1 2 > * hotpluggable: n y y n > * movablemem_map: |_| |_| > * > * Using movablemem_map, we can prevent memblock from allocating > memory > * on ZONE_MOVABLE at boot time. > */ > > So user just specify movablemem_map=acpi, and the kernel will use hotpluggable > info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE. Well, as a reult of my previous hackery, arch/x86/mm/srat.c now looks rather different. Please check it carefully and runtime test this code when it appears in linux-next? /* * ACPI 3.0 based NUMA setup * Copyright 2004 Andi Kleen, SuSE Labs. * * Reads the ACPI SRAT table to figure out what memory belongs to which CPUs. * * Called from acpi_numa_init while reading the SRAT and SLIT tables. * Assumes all memory regions belonging to a single proximity domain * are in one chunk. Holes between them will be included in the node. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include int acpi_numa __initdata; static __init int setup_node(int pxm) { return acpi_map_pxm_to_node(pxm); } static __init void bad_srat(void) { printk(KERN_ERR "SRAT: SRAT not used.\n"); acpi_numa = -1; } static __init inline int srat_disabled(void) { return acpi_numa < 0; } /* Callback for SLIT parsing */ void __init acpi_numa_slit_init(struct acpi_table_slit *slit) { int i, j; for (i = 0; i < slit->locality_count; i++) for (j = 0; j < slit->locality_count; j++) numa_set_distance(pxm_to_node(i), pxm_to_node(j), slit->entry[slit->locality_count * i + j]); } /* Callback for Proximity Domain -> x2APIC mapping */ void __init acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa) { int pxm, node; int apic_id; if (srat_disabled()) return; if (pa->header.length < sizeof(struct acpi_srat_x2apic_cpu_affinity)) { bad_srat(); return; } if ((pa->flags & ACPI_SRAT_CPU_ENABLED) == 0) return; pxm = pa->proximity_domain; apic_id = pa->apic_id; if (!apic->apic_id_valid(apic_id)) { printk(KERN_INFO "SRAT: PXM %u -> X2APIC 0x%04x ignored\n", pxm, apic_id); return; } node = setup_node(pxm); if (node < 0) { printk(KERN_ERR "SRAT: Too many proximity domains %x\n", pxm); bad_srat(); return; } if (apic_id >= MAX_LOCAL_APIC) { printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%04x -> Node %u skipped apicid that is too big\n", pxm, apic_id, node); return; } set_apicid_to_node(apic_id, node); node_set(node, numa_nodes_parsed); acpi_numa = 1; printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%04x -> Node %u\n", pxm, apic_id, node); } /* Callback for Proximity Domain -> LAPIC mapping */ void __init acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa) { int pxm, node; int apic_id; if (srat_disabled()) return; if (pa->header.length != sizeof(struct acpi_srat_cpu_affinity)) { bad_srat(); return; } if ((pa->flags & ACPI_SRAT_CPU_ENABLED) == 0) return; pxm = pa->proximity_domain_lo; if (acpi_srat_revision >= 2) pxm |= *((unsigned int*)pa->proximity_domain_hi) << 8; node = setup_node(pxm); if (node < 0) { printk(KERN_ERR "SRAT: Too many proximity domains %x\n", pxm); bad_srat(); return; } if (get_uv_system_type() >= UV_X2APIC) apic_id = (pa->apic_id << 8) | pa->local_sapic_eid; else apic_id = pa->apic_id; if (apic_id >= MAX_LOCAL_APIC) { printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%02x -> Node %u skipped apicid that is too big\n", pxm, apic_id, node); return; } set_apicid_to_node(apic_id, node); node_set(node, numa_nodes_parsed); acpi_numa = 1; printk(KERN_INFO "SRAT: PXM %u -> APIC 0x%02x -> Node %u\n", pxm, apic_id, node); } #ifdef CONFIG_MEMORY_HOTPLUG static inline int save_add_info(void)
[PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
We now provide an option for users who don't want to specify physical memory address in kernel commandline. /* * For movablemem_map=acpi: * * SRAT:|_| |_| |_| |_| .. * node id:0 1 1 2 * hotpluggable: n y y n * movablemem_map: |_| |_| * * Using movablemem_map, we can prevent memblock from allocating memory * on ZONE_MOVABLE at boot time. */ So user just specify movablemem_map=acpi, and the kernel will use hotpluggable info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE. NOTE: Using this way will cause NUMA performance down because the whole node will be set as ZONE_MOVABLE, and kernel cannot use memory on it. If users don't want to lose NUMA performance, just don't use it. Signed-off-by: Tang Chen --- Documentation/kernel-parameters.txt | 23 ++- arch/x86/mm/srat.c | 22 +- include/linux/mm.h |1 + mm/page_alloc.c | 22 +- 4 files changed, 61 insertions(+), 7 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 7770611..3d9dc9d 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1637,15 +1637,28 @@ bytes respectively. Such letter suffixes can also be entirely omitted. that the amount of memory usable for all allocations is not too small. + movablemem_map=acpi + [KNL,X86,IA-64,PPC] This parameter is similar to + memmap except it specifies the memory map of + ZONE_MOVABLE. + This option inform the kernel to use Hot Pluggable bit + in flags from SRAT from ACPI BIOS to determine which + memory devices could be hotplugged. The corresponding + memory ranges will be set as ZONE_MOVABLE. + movablemem_map=nn[KMG]@ss[KMG] [KNL,X86,IA-64,PPC] This parameter is similar to memmap except it specifies the memory map of ZONE_MOVABLE. - If more areas are all within one node, then from - lowest ss to the end of the node will be ZONE_MOVABLE. - If an area covers two or more nodes, the area from - ss to the end of the 1st node will be ZONE_MOVABLE, - and all the rest nodes will only have ZONE_MOVABLE. + If user specifies memory ranges, the info in SRAT will + be ingored. And it works like the following: + - If more ranges are all within one node, then from + lowest ss to the end of the node will be ZONE_MOVABLE. + - If a range is within a node, then from ss to the end + of the node will be ZONE_MOVABLE. + - If a range covers two or more nodes, then from ss to + the end of the 1st node will be ZONE_MOVABLE, and all + the rest nodes will only have ZONE_MOVABLE. If memmap is specified at the same time, the movablemem_map will be limited within the memmap areas. If kernelcore or movablecore is also specified, diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c index f841d0e..94d6e72 100644 --- a/arch/x86/mm/srat.c +++ b/arch/x86/mm/srat.c @@ -198,7 +198,23 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma) end_pfn = PFN_UP(end); /* -* For movablecore_map=nn[KMG]@ss[KMG]: +* For movablemem_map=acpi: +* +* SRAT:|_| |_| |_| |_| .. +* node id:0 1 1 2 +* hotpluggable: n y y n +* movablemem_map: |_| |_| +* +* Using movablemem_map, we can prevent memblock from allocating memory +* on ZONE_MOVABLE at boot time. +*/ + if (hotpluggable && movablemem_map.acpi) { + insert_movablemem_map(start_pfn, end_pfn); + goto out; + } + + /* +* For movablemem_map=nn[KMG]@ss[KMG]: * * SRAT:|_| |_| |_| |_| .. * node id:0 1 1 2 @@ -207,6 +223,8 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma) * * Using movablemem_map, we can
[PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
We now provide an option for users who don't want to specify physical memory address in kernel commandline. /* * For movablemem_map=acpi: * * SRAT:|_| |_| |_| |_| .. * node id:0 1 1 2 * hotpluggable: n y y n * movablemem_map: |_| |_| * * Using movablemem_map, we can prevent memblock from allocating memory * on ZONE_MOVABLE at boot time. */ So user just specify movablemem_map=acpi, and the kernel will use hotpluggable info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE. NOTE: Using this way will cause NUMA performance down because the whole node will be set as ZONE_MOVABLE, and kernel cannot use memory on it. If users don't want to lose NUMA performance, just don't use it. Signed-off-by: Tang Chen tangc...@cn.fujitsu.com --- Documentation/kernel-parameters.txt | 23 ++- arch/x86/mm/srat.c | 22 +- include/linux/mm.h |1 + mm/page_alloc.c | 22 +- 4 files changed, 61 insertions(+), 7 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 7770611..3d9dc9d 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1637,15 +1637,28 @@ bytes respectively. Such letter suffixes can also be entirely omitted. that the amount of memory usable for all allocations is not too small. + movablemem_map=acpi + [KNL,X86,IA-64,PPC] This parameter is similar to + memmap except it specifies the memory map of + ZONE_MOVABLE. + This option inform the kernel to use Hot Pluggable bit + in flags from SRAT from ACPI BIOS to determine which + memory devices could be hotplugged. The corresponding + memory ranges will be set as ZONE_MOVABLE. + movablemem_map=nn[KMG]@ss[KMG] [KNL,X86,IA-64,PPC] This parameter is similar to memmap except it specifies the memory map of ZONE_MOVABLE. - If more areas are all within one node, then from - lowest ss to the end of the node will be ZONE_MOVABLE. - If an area covers two or more nodes, the area from - ss to the end of the 1st node will be ZONE_MOVABLE, - and all the rest nodes will only have ZONE_MOVABLE. + If user specifies memory ranges, the info in SRAT will + be ingored. And it works like the following: + - If more ranges are all within one node, then from + lowest ss to the end of the node will be ZONE_MOVABLE. + - If a range is within a node, then from ss to the end + of the node will be ZONE_MOVABLE. + - If a range covers two or more nodes, then from ss to + the end of the 1st node will be ZONE_MOVABLE, and all + the rest nodes will only have ZONE_MOVABLE. If memmap is specified at the same time, the movablemem_map will be limited within the memmap areas. If kernelcore or movablecore is also specified, diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c index f841d0e..94d6e72 100644 --- a/arch/x86/mm/srat.c +++ b/arch/x86/mm/srat.c @@ -198,7 +198,23 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma) end_pfn = PFN_UP(end); /* -* For movablecore_map=nn[KMG]@ss[KMG]: +* For movablemem_map=acpi: +* +* SRAT:|_| |_| |_| |_| .. +* node id:0 1 1 2 +* hotpluggable: n y y n +* movablemem_map: |_| |_| +* +* Using movablemem_map, we can prevent memblock from allocating memory +* on ZONE_MOVABLE at boot time. +*/ + if (hotpluggable movablemem_map.acpi) { + insert_movablemem_map(start_pfn, end_pfn); + goto out; + } + + /* +* For movablemem_map=nn[KMG]@ss[KMG]: * * SRAT:|_| |_| |_| |_| .. * node id:0 1 1 2 @@ -207,6 +223,8 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma) * * Using
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen tangc...@cn.fujitsu.com wrote: We now provide an option for users who don't want to specify physical memory address in kernel commandline. /* * For movablemem_map=acpi: * * SRAT:|_| |_| |_| |_| .. * node id:0 1 1 2 * hotpluggable: n y y n * movablemem_map: |_| |_| * * Using movablemem_map, we can prevent memblock from allocating memory * on ZONE_MOVABLE at boot time. */ So user just specify movablemem_map=acpi, and the kernel will use hotpluggable info in SRAT to determine which memory ranges should be set as ZONE_MOVABLE. Well, as a reult of my previous hackery, arch/x86/mm/srat.c now looks rather different. Please check it carefully and runtime test this code when it appears in linux-next? /* * ACPI 3.0 based NUMA setup * Copyright 2004 Andi Kleen, SuSE Labs. * * Reads the ACPI SRAT table to figure out what memory belongs to which CPUs. * * Called from acpi_numa_init while reading the SRAT and SLIT tables. * Assumes all memory regions belonging to a single proximity domain * are in one chunk. Holes between them will be included in the node. */ #include linux/kernel.h #include linux/acpi.h #include linux/mmzone.h #include linux/bitmap.h #include linux/module.h #include linux/topology.h #include linux/bootmem.h #include linux/memblock.h #include linux/mm.h #include asm/proto.h #include asm/numa.h #include asm/e820.h #include asm/apic.h #include asm/uv/uv.h int acpi_numa __initdata; static __init int setup_node(int pxm) { return acpi_map_pxm_to_node(pxm); } static __init void bad_srat(void) { printk(KERN_ERR SRAT: SRAT not used.\n); acpi_numa = -1; } static __init inline int srat_disabled(void) { return acpi_numa 0; } /* Callback for SLIT parsing */ void __init acpi_numa_slit_init(struct acpi_table_slit *slit) { int i, j; for (i = 0; i slit-locality_count; i++) for (j = 0; j slit-locality_count; j++) numa_set_distance(pxm_to_node(i), pxm_to_node(j), slit-entry[slit-locality_count * i + j]); } /* Callback for Proximity Domain - x2APIC mapping */ void __init acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa) { int pxm, node; int apic_id; if (srat_disabled()) return; if (pa-header.length sizeof(struct acpi_srat_x2apic_cpu_affinity)) { bad_srat(); return; } if ((pa-flags ACPI_SRAT_CPU_ENABLED) == 0) return; pxm = pa-proximity_domain; apic_id = pa-apic_id; if (!apic-apic_id_valid(apic_id)) { printk(KERN_INFO SRAT: PXM %u - X2APIC 0x%04x ignored\n, pxm, apic_id); return; } node = setup_node(pxm); if (node 0) { printk(KERN_ERR SRAT: Too many proximity domains %x\n, pxm); bad_srat(); return; } if (apic_id = MAX_LOCAL_APIC) { printk(KERN_INFO SRAT: PXM %u - APIC 0x%04x - Node %u skipped apicid that is too big\n, pxm, apic_id, node); return; } set_apicid_to_node(apic_id, node); node_set(node, numa_nodes_parsed); acpi_numa = 1; printk(KERN_INFO SRAT: PXM %u - APIC 0x%04x - Node %u\n, pxm, apic_id, node); } /* Callback for Proximity Domain - LAPIC mapping */ void __init acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa) { int pxm, node; int apic_id; if (srat_disabled()) return; if (pa-header.length != sizeof(struct acpi_srat_cpu_affinity)) { bad_srat(); return; } if ((pa-flags ACPI_SRAT_CPU_ENABLED) == 0) return; pxm = pa-proximity_domain_lo; if (acpi_srat_revision = 2) pxm |= *((unsigned int*)pa-proximity_domain_hi) 8; node = setup_node(pxm); if (node 0) { printk(KERN_ERR SRAT: Too many proximity domains %x\n, pxm); bad_srat(); return; } if (get_uv_system_type() = UV_X2APIC) apic_id = (pa-apic_id 8) | pa-local_sapic_eid; else apic_id = pa-apic_id; if (apic_id = MAX_LOCAL_APIC) { printk(KERN_INFO SRAT: PXM %u - APIC 0x%02x - Node %u skipped apicid that is too big\n, pxm, apic_id, node); return; } set_apicid_to_node(apic_id, node); node_set(node, numa_nodes_parsed); acpi_numa = 1; printk(KERN_INFO SRAT: PXM %u - APIC
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen tangc...@cn.fujitsu.com wrote: NOTE: Using this way will cause NUMA performance down because the whole node will be set as ZONE_MOVABLE, and kernel cannot use memory on it. If users don't want to lose NUMA performance, just don't use it. I agree with this, but it means that nobody will test any of your new code. To get improved testing coverage, can you think of any temporary testing-only patch which will cause testers to exercise the memory-hotplug changes? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.
On 01/25/2013 05:12 PM, Andrew Morton wrote: On Fri, 25 Jan 2013 17:42:09 +0800 Tang Chen tangc...@cn.fujitsu.com wrote: NOTE: Using this way will cause NUMA performance down because the whole node will be set as ZONE_MOVABLE, and kernel cannot use memory on it. If users don't want to lose NUMA performance, just don't use it. I agree with this, but it means that nobody will test any of your new code. To get improved testing coverage, can you think of any temporary testing-only patch which will cause testers to exercise the memory-hotplug changes? There is another problem: if ALL the nodes in the system support hotpluggable memory, what happens? -hpa -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/