Re: [PATCH 1/4][V2] powerpc : add support for linux, usable-memory properties for drconf memory
Scan for linux,usable-memory properties in case of dynamic reconfiguration
memory. Support for kexec/kdump.

Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]>
---

Patch applies on powerpc tree. Patch was reviewed by Nathan Fontenot,
Stephen Rothwell, Michael Neuling.

 arch/powerpc/kernel/prom.c | 40 +++--
 arch/powerpc/mm/numa.c | 48 ---
 2 files changed, 65 insertions(+), 23 deletions(-)

diff -Naurp powerpc-orig/arch/powerpc/kernel/prom.c powerpc/arch/powerpc/kernel/prom.c
--- powerpc-orig/arch/powerpc/kernel/prom.c	2008-07-22 14:11:53.0 +0530
+++ powerpc/arch/powerpc/kernel/prom.c	2008-07-22 14:12:17.0 +0530
@@ -888,9 +888,10 @@ static u64 __init dt_mem_next_cell(int s
  */
 static int __init early_init_dt_scan_drconf_memory(unsigned long node)
 {
-	cell_t *dm, *ls;
-	unsigned long l, n, flags;
+	cell_t *dm, *ls, *usm;
+	unsigned long l, n, flags, ranges;
 	u64 base, size, lmb_size;
+	char buf[32];
 
 	ls = (cell_t *)of_get_flat_dt_prop(node, "ibm,lmb-size", &l);
 	if (ls == NULL || l < dt_root_size_cells * sizeof(cell_t))
@@ -914,14 +915,37 @@ static int __init early_init_dt_scan_drc
 		   or if the block is not assigned to this partition (0x8) */
 		if ((flags & 0x80) || !(flags & 0x8))
 			continue;
-		size = lmb_size;
-		if (iommu_is_off) {
+		if (iommu_is_off)
 			if (base >= 0x8000ul)
 				continue;
-			if ((base + size) > 0x8000ul)
-				size = 0x8000ul - base;
-		}
-		lmb_add(base, size);
+		size = lmb_size;
+
+		/*
+		 * Append 'n' to 'linux,usable-memory' to get special
+		 * properties passed in by tools like kexec-tools. Relevant
+		 * only if this is a kexec/kdump kernel.
+		 */
+		sprintf(buf, "linux,usable-memory%d", (int)n);
+		usm = of_get_flat_dt_prop(node, buf, &l);
+		ranges = 1;
+		if (usm != NULL)
+			ranges = (l >> 2)/(dt_root_addr_cells +
+					dt_root_size_cells);
+		do {
+			if (usm != NULL) {
+				base = dt_mem_next_cell(dt_root_addr_cells,
+							&usm);
+				size = dt_mem_next_cell(dt_root_size_cells,
+							&usm);
+				if (size == 0)
+					break;
+			}
+			if (iommu_is_off)
+				if ((base + size) > 0x8000ul)
+					size = 0x8000ul - base;
+
+			lmb_add(base, size);
+		} while (--ranges);
 	}
 	lmb_dump_all();
 	return 0;
diff -Naurp powerpc-orig/arch/powerpc/mm/numa.c powerpc/arch/powerpc/mm/numa.c
--- powerpc-orig/arch/powerpc/mm/numa.c	2008-07-22 14:11:53.0 +0530
+++ powerpc/arch/powerpc/mm/numa.c	2008-07-22 14:12:17.0 +0530
@@ -493,11 +493,13 @@ static unsigned long __init numa_enforce
  */
 static void __init parse_drconf_memory(struct device_node *memory)
 {
-	const u32 *dm;
-	unsigned int n, rc;
-	unsigned long lmb_size, size;
+	const u32 *dm, *usm;
+	unsigned int n, rc, len, ranges;
+	unsigned long lmb_size, size, sz;
 	int nid;
 	struct assoc_arrays aa;
+	char buf[32];
+	u64 base;
 
 	n = of_get_drconf_memory(memory, &dm);
 	if (!n)
@@ -524,19 +526,35 @@ static void __init parse_drconf_memory(s
 
 		nid = of_drconf_to_nid_single(&drmem, &aa);
 
-		fake_numa_create_new_node(
-				((drmem.base_addr + lmb_size) >> PAGE_SHIFT),
-					   &nid);
-
-		node_set_online(nid);
-
-		size = numa_enforce_memory_limit(drmem.base_addr, lmb_size);
-		if (!size)
-			continue;
+		/*
+		 * Append 'n' to 'linux,usable-memory' to get special
+		 * properties passed in by tools like kexec-tools. Relevant
+		 * only if this is a kexec/kdump kernel.
+		 */
+		sprintf(buf, "linux,usable-memory%d", (int)n);
+		usm = of_get_property(memory, buf, &len);
+		ranges = 1;
+		if (usm != NULL)
+			ranges = (len >> 2) /
Re: [PATCH 1/4][V2] powerpc : add support for linux, usable-memory properties for drconf memory
On Tuesday 22 July 2008 14:46:20 Paul Mackerras wrote:
> Chandru writes:
>
> > Scan for linux,usable-memory properties in case of dynamic reconfiguration
> > memory . Support for kexec/kdump.
> >
> > Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]>
>
> Could we *please* have a more comprehensive patch description that
> that? Something which will help people coming along in two (or five
> or ten) years time to understand what problem exists in the code, how
> this patch solves it, and why this approach was chosen over any
> alternative approaches?
>
> Thanks,
> Paul.
>

An alternative approach could be to create a single
'linux,usable-drconf-memory' property and add all the usable memory regions
to it, in a similar fashion to the ibm,dynamic-memory property. For a given
lmb in ibm,dynamic-memory, a corresponding usable-memory entry could be
created that contains one or more (base, size) pairs. Each entry in the new
'linux,usable-drconf-memory' property would start with a counter that tells
us how many (base, size) pairs it holds. Some part of the current code may
get duplicated. Let me go back and check whether I can work on a patch for
this approach.

thx,
Chandru
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
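The layout proposed above (per lmb: one counter, then that many (base, size) pairs) can be walked with a single cursor. The following is only a rough sketch of that idea, not the kernel code: `struct usable_range` and `parse_drconf_usable` are hypothetical names, and the cells are assumed to be host-endian 64-bit values, whereas a real flattened device tree stores big-endian 32-bit cells.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output record for this sketch; not a kernel type. */
struct usable_range {
	uint64_t base;
	uint64_t size;
};

/*
 * Walk a flat array laid out, for each lmb, as:
 *     count, then `count` (base, size) pairs.
 * Returns the number of ranges stored in `out`, or -1 if the input is
 * truncated or `out` is too small.
 */
int parse_drconf_usable(const uint64_t *cells, size_t ncells, size_t nlmbs,
			struct usable_range *out, size_t max_out)
{
	size_t pos = 0, found = 0;

	assert(cells != NULL && out != NULL);

	for (size_t lmb = 0; lmb < nlmbs; lmb++) {
		uint64_t count;

		if (pos >= ncells)
			return -1;	/* no counter for this lmb */
		count = cells[pos++];

		/* Every range needs two cells; reject a short buffer. */
		if (count > (ncells - pos) / 2)
			return -1;

		for (uint64_t i = 0; i < count; i++) {
			if (found == max_out)
				return -1;
			out[found].base = cells[pos++];
			out[found].size = cells[pos++];
			found++;
		}
	}
	return (int)found;
}
```

A counter of zero simply contributes no ranges for that lmb, which matches the idea that an lmb with no usable region is skipped.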
[PATCH 1/2][V3] powerpc: add support for dynamic reconfiguration memory in kexec/kdump kernels
A kdump kernel must use only the memory regions that it is allowed to use
(crashkernel, rtas, tce, etc.). Each of these regions has its own size, and
they are currently added as a 'linux,usable-memory' property under each
memory@ node of the device tree. The ibm,dynamic-memory property of the
ibm,dynamic-reconfiguration-memory node now holds the representation of most
of the logical memory blocks (lmbs), each of a constant size (lmb_size). If
one or more of the above regions, or part of one, lies within an lmb from the
ibm,dynamic-memory property, those regions need to be identified within that
lmb. The following patch recognizes a new 'linux,drconf-usable-memory'
property added by kexec-tools. Each entry in this property consists of a
counter followed by that many (base, size) pairs for the above regions.

Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]>
---

These patches were sent earlier; this is V3. Please let me know your
thoughts. Thanks.

 arch/powerpc/kernel/prom.c | 40 +++--
 arch/powerpc/mm/numa.c | 79 +++
 2 files changed, 96 insertions(+), 23 deletions(-)

diff -Naurp powerpc-orig/arch/powerpc/kernel/prom.c powerpc/arch/powerpc/kernel/prom.c
--- powerpc-orig/arch/powerpc/kernel/prom.c	2008-08-14 08:23:25.0 +0530
+++ powerpc/arch/powerpc/kernel/prom.c	2008-08-14 14:35:24.0 +0530
@@ -888,9 +888,10 @@ static u64 __init dt_mem_next_cell(int s
  */
 static int __init early_init_dt_scan_drconf_memory(unsigned long node)
 {
-	cell_t *dm, *ls;
+	cell_t *dm, *ls, *usm;
 	unsigned long l, n, flags;
 	u64 base, size, lmb_size;
+	unsigned int is_kexec_kdump = 0, rngs;
 
 	ls = (cell_t *)of_get_flat_dt_prop(node, "ibm,lmb-size", &l);
 	if (ls == NULL || l < dt_root_size_cells * sizeof(cell_t))
@@ -905,6 +906,12 @@ static int __init early_init_dt_scan_drc
 	if (l < (n * (dt_root_addr_cells + 4) + 1) * sizeof(cell_t))
 		return 0;
 
+	/* check if this is a kexec/kdump kernel. */
+	usm = (cell_t *)of_get_flat_dt_prop(node, "linux,drconf-usable-memory",
+						 &l);
+	if (usm != NULL)
+		is_kexec_kdump = 1;
+
 	for (; n != 0; --n) {
 		base = dt_mem_next_cell(dt_root_addr_cells, &dm);
 		flags = dm[3];
@@ -915,13 +922,34 @@ static int __init early_init_dt_scan_drc
 		if ((flags & 0x80) || !(flags & 0x8))
 			continue;
 		size = lmb_size;
-		if (iommu_is_off) {
-			if (base >= 0x8000ul)
+		rngs = 1;
+		if (is_kexec_kdump) {
+			/*
+			 * For each lmb in ibm,dynamic-memory, a corresponding
+			 * entry in linux,drconf-usable-memory property contains
+			 * a counter 'p' followed by 'p' (base, size) duple.
+			 * Now read the counter from
+			 * linux,drconf-usable-memory property
+			 */
+			rngs = dt_mem_next_cell(dt_root_size_cells, &usm);
+			if (!rngs) /* there are no (base, size) duple */
 				continue;
-			if ((base + size) > 0x8000ul)
-				size = 0x8000ul - base;
 		}
-		lmb_add(base, size);
+		do {
+			if (is_kexec_kdump) {
+				base = dt_mem_next_cell(dt_root_addr_cells,
+							 &usm);
+				size = dt_mem_next_cell(dt_root_size_cells,
+							 &usm);
+			}
+			if (iommu_is_off) {
+				if (base >= 0x8000ul)
+					continue;
+				if ((base + size) > 0x8000ul)
+					size = 0x8000ul - base;
+			}
+			lmb_add(base, size);
+		} while (--rngs);
 	}
 	lmb_dump_all();
 	return 0;
diff -Naurp powerpc-orig/arch/powerpc/mm/numa.c powerpc/arch/powerpc/mm/numa.c
--- powerpc-orig/arch/powerpc/mm/numa.c	2008-08-14 08:23:25.0 +0530
+++ powerpc/arch/powerpc/mm/numa.c	2008-08-14 14:35:42.0 +0530
@@ -150,6 +150,21 @@ static const int *of_get_associativity(s
 	return of_get_property(dev, "ibm,associativity", NULL);
 }
 
+/*
+ * Returns the property linux,drconf-usable-memory if
+ * it exists (the property exists only in
[PATCH 2/2][V3] kexec-tools: create a new linux, drconf-usable-memory property
Add a new linux,drconf-usable-memory property to the device tree. This
property stores the usable memory regions for the kexec/kdump kernel. The
other changes to kexec-tools, which do not affect the kernel, are not
attached here. This patch contains the kexec-tools changes; patch 1/2
contains the kernel changes.

Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]>
---

 kexec/arch/ppc64/fs2dt.c | 72 +
 1 file changed, 72 insertions(+)

--- kexec-tools-testing-orig/kexec/arch/ppc64/fs2dt.c	2008-08-14 14:41:52.0 +0530
+++ kexec-tools-testing/kexec/arch/ppc64/fs2dt.c	2008-08-14 14:46:15.0 +0530
@@ -122,6 +122,74 @@ static unsigned propnum(const char *name
 	return offset;
 }
 
+static void add_dyn_reconf_usable_mem_property(int fd)
+{
+	char fname[MAXPATH], *bname;
+	uint64_t buf[32];
+	uint64_t ranges[2*MAX_MEMORY_RANGES];
+	uint64_t base, end, loc_base, loc_end;
+	int range, rlen = 0, i;
+	int rngs_cnt, tmp_indx;
+
+	strcpy(fname, pathname);
+	bname = strrchr(fname, '/');
+	bname[0] = '\0';
+	bname = strrchr(fname, '/');
+	if (strncmp(bname, "/ibm,dynamic-reconfiguration-memory", 36))
+		return;
+
+	if (lseek(fd, 4, SEEK_SET) < 0)
+		die("unrecoverable error: error seeking in \"%s\": %s\n",
+			pathname, strerror(errno));
+
+	rlen = 0;
+	for (i = 0; i < num_of_lmbs; i++) {
+		if (read(fd, buf, 24) < 0)
+			die("unrecoverable error: error reading \"%s\": %s\n",
+				pathname, strerror(errno));
+
+		base = (uint64_t) buf[0];
+		end = base + lmb_size;
+		if (~0ULL - base < end)
+			die("unrecoverable error: mem property overflow\n");
+
+		tmp_indx = rlen++;
+
+		rngs_cnt = 0;
+		for (range = 0; range < usablemem_rgns.size; range++) {
+			loc_base = usablemem_rgns.ranges[range].start;
+			loc_end = usablemem_rgns.ranges[range].end;
+			if (loc_base >= base && loc_end <= end) {
+				ranges[rlen++] = loc_base;
+				ranges[rlen++] = loc_end - loc_base;
+				rngs_cnt++;
+			} else if (base < loc_end && end > loc_base) {
+				if (loc_base < base)
+					loc_base = base;
+				if (loc_end > end)
+					loc_end = end;
+				ranges[rlen++] = loc_base;
+				ranges[rlen++] = loc_end - loc_base;
+				rngs_cnt++;
+			}
+		}
+		/* Store the count of (base, size) duple */
+		ranges[tmp_indx] = rngs_cnt;
+	}
+
+	rlen = rlen * sizeof(uint64_t);
+	/*
+	 * Add linux,drconf-usable-memory property.
+	 */
+	*dt++ = 3;
+	*dt++ = rlen;
+	*dt++ = propnum("linux,drconf-usable-memory");
+	if ((rlen >= 8) && ((unsigned long)dt & 0x4))
+		dt++;
+	memcpy(dt, &ranges, rlen);
+	dt += (rlen + 3)/4;
+}
+
 static void add_usable_mem_property(int fd, int len)
 {
 	char fname[MAXPATH], *bname;
@@ -267,6 +335,10 @@ static void putprops(char *fn, struct di
 		dt += (len + 3)/4;
 		if (!strcmp(dp->d_name, "reg") && usablemem_rgns.size)
 			add_usable_mem_property(fd, len);
+		if (!strcmp(dp->d_name, "ibm,dynamic-memory") &&
+					usablemem_rgns.size)
+			add_dyn_reconf_usable_mem_property(fd);
+
 		close(fd);
 	}
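The per-lmb step in add_dyn_reconf_usable_mem_property() amounts to intersecting every usable region with the lmb and writing a count followed by the clipped (base, size) pairs. A minimal standalone sketch of that step follows. It is not the kexec-tools code: `struct mem_range` and `emit_lmb_entry` are hypothetical names, ranges are assumed to be half-open [start, end), and the two overlap branches of the patch are folded into one clamp. Unlike the patch, the sketch also checks the output capacity before writing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open range [start, end) used only in this sketch. */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Write one lmb's entry into `out`: cell 0 is the count, followed by
 * that many (base, size) pairs clipped to the lmb.
 * Returns the number of cells written, or 0 if `out` is too small.
 */
size_t emit_lmb_entry(uint64_t lmb_base, uint64_t lmb_size,
		      const struct mem_range *usable, size_t nusable,
		      uint64_t *out, size_t out_cells)
{
	uint64_t lmb_end = lmb_base + lmb_size;
	uint64_t count = 0;
	size_t pos = 1;		/* cell 0 is reserved for the count */

	assert(usable != NULL && out != NULL);
	if (out_cells < 1)
		return 0;

	for (size_t i = 0; i < nusable; i++) {
		uint64_t s = usable[i].start;
		uint64_t e = usable[i].end;

		/* Skip regions that do not overlap this lmb at all. */
		if (s >= lmb_end || e <= lmb_base)
			continue;

		/* Clip the region to the lmb boundaries. */
		if (s < lmb_base)
			s = lmb_base;
		if (e > lmb_end)
			e = lmb_end;

		if (pos + 2 > out_cells)
			return 0;
		out[pos++] = s;
		out[pos++] = e - s;
		count++;
	}
	out[0] = count;
	return pos;
}
```

Calling this once per lmb and concatenating the results gives the counter-prefixed layout that the kernel side reads back.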
Re: [PATCH 1/2][V3] powerpc: add support for dynamic reconfiguration memory in kexec/kdump kernels
Please let me know the status of this patch.

Thanks,
Chandru

On Thursday 14 August 2008 15:17:32 Chandru wrote:
> kdump kernel needs to use only those memory regions that it is allowed to use
> (crashkernel, rtas, tce ..etc ). Each of these regions have their own sizes
> and are currently added under 'linux,usable-memory' property under each
> memory@ node of the device tree. ibm,dynamic-memory property of
> ibm,dynamic-reconfiguration-memory node now stores in it the representation
> for most of the logical memory blocks with the size of each memory block
> being a constant (lmb_size). If one or more or part of the above mentioned
> regions lie under one of the lmb from ibm,dynamic-memory property, there is a
> need to identify those regions within the given lmb. Following patch
> recognizes a new property 'linux,drconf-usable-memory' property added by
> kexec-tools. Each entry in this property is of the form 'a counter' followed
> by those many (base, size) duple for the above mentioned regions.
>
> Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]>
> ---
>
> These patches were sent earlier but these are V3 of the patches. Pls let me
> know your thoughts. Thanks.
[PATCH][V3] powerpc: add support for dynamic reconfiguration memory in kexec/kdump kernels
A kdump kernel must use only the memory regions that it is allowed to use
(crashkernel, rtas, tce, etc.). Each of these regions has its own size, and
they are currently added as a 'linux,usable-memory' property under each
memory@ node of the device tree. The ibm,dynamic-memory property of the
ibm,dynamic-reconfiguration-memory node (on power6) now holds the
representation of most of the logical memory blocks (lmbs), each of a
constant size (lmb_size). If one or more of the above regions, or part of
one, lies within an lmb from the ibm,dynamic-memory property, those regions
need to be identified within that lmb. The following patch recognizes a new
'linux,drconf-usable-memory' property added by kexec-tools. Each entry in
this property consists of a counter followed by that many (base, size) pairs
for the above regions.

Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]>
---

Patch was tested on a power6 box. Could you please add this patch to the
powerpc git tree? Thanks,

 arch/powerpc/kernel/prom.c | 40 +++--
 arch/powerpc/mm/numa.c | 79 +++
 2 files changed, 96 insertions(+), 23 deletions(-)

diff -Naurp powerpc-orig/arch/powerpc/kernel/prom.c powerpc/arch/powerpc/kernel/prom.c
--- powerpc-orig/arch/powerpc/kernel/prom.c	2008-08-14 08:23:25.0 +0530
+++ powerpc/arch/powerpc/kernel/prom.c	2008-08-14 14:35:24.0 +0530
@@ -888,9 +888,10 @@ static u64 __init dt_mem_next_cell(int s
  */
 static int __init early_init_dt_scan_drconf_memory(unsigned long node)
 {
-	cell_t *dm, *ls;
+	cell_t *dm, *ls, *usm;
 	unsigned long l, n, flags;
 	u64 base, size, lmb_size;
+	unsigned int is_kexec_kdump = 0, rngs;
 
 	ls = (cell_t *)of_get_flat_dt_prop(node, "ibm,lmb-size", &l);
 	if (ls == NULL || l < dt_root_size_cells * sizeof(cell_t))
@@ -905,6 +906,12 @@ static int __init early_init_dt_scan_drc
 	if (l < (n * (dt_root_addr_cells + 4) + 1) * sizeof(cell_t))
 		return 0;
 
+	/* check if this is a kexec/kdump kernel. */
+	usm = (cell_t *)of_get_flat_dt_prop(node, "linux,drconf-usable-memory",
+						 &l);
+	if (usm != NULL)
+		is_kexec_kdump = 1;
+
 	for (; n != 0; --n) {
 		base = dt_mem_next_cell(dt_root_addr_cells, &dm);
 		flags = dm[3];
@@ -915,13 +922,34 @@ static int __init early_init_dt_scan_drc
 		if ((flags & 0x80) || !(flags & 0x8))
 			continue;
 		size = lmb_size;
-		if (iommu_is_off) {
-			if (base >= 0x8000ul)
+		rngs = 1;
+		if (is_kexec_kdump) {
+			/*
+			 * For each lmb in ibm,dynamic-memory, a corresponding
+			 * entry in linux,drconf-usable-memory property contains
+			 * a counter 'p' followed by 'p' (base, size) duple.
+			 * Now read the counter from
+			 * linux,drconf-usable-memory property
+			 */
+			rngs = dt_mem_next_cell(dt_root_size_cells, &usm);
+			if (!rngs) /* there are no (base, size) duple */
 				continue;
-			if ((base + size) > 0x8000ul)
-				size = 0x8000ul - base;
 		}
-		lmb_add(base, size);
+		do {
+			if (is_kexec_kdump) {
+				base = dt_mem_next_cell(dt_root_addr_cells,
+							 &usm);
+				size = dt_mem_next_cell(dt_root_size_cells,
+							 &usm);
+			}
+			if (iommu_is_off) {
+				if (base >= 0x8000ul)
+					continue;
+				if ((base + size) > 0x8000ul)
+					size = 0x8000ul - base;
+			}
+			lmb_add(base, size);
+		} while (--rngs);
 	}
 	lmb_dump_all();
 	return 0;
diff -Naurp powerpc-orig/arch/powerpc/mm/numa.c powerpc/arch/powerpc/mm/numa.c
--- powerpc-orig/arch/powerpc/mm/numa.c	2008-08-14 08:23:25.0 +0530
+++ powerpc/arch/powerpc/mm/numa.c	2008-08-14 14:35:42.0 +0530
@@ -150,6 +150,21 @@ static const int *of_get_associativity(s
 	return of_get_property(dev, "ibm,associativity", NULL);
 }
 
+/*
+ * Returns the property linux,drconf-usable-memory if
+ * it exists (the proper
device tree in open firmware on power6
Hi,

When I set linux 2.6.26-rc1 as the default kernel to boot in
/etc/yaboot.conf, the device tree in open firmware shows only one memory
node (the same memory node appears in /proc/device-tree/[EMAIL PROTECTED]).
But when the RHEL5.2 kernel is set as the default in /etc/yaboot.conf, the
device tree in open firmware shows plenty of memory nodes. Following is the
open firmware output.

linux-2.6.26-rc1:

0 > dev / ls
...
00caf1b8: /PowerPC,[EMAIL PROTECTED]
00cb0120: /[EMAIL PROTECTED]
00cb83d8: /ibm,dynamic-reconfiguration-memory
00cbcd60: /options
...
0 > dev /[EMAIL PROTECTED]  ok
0 > .properties
name memory
device_type memory
reg 0800
available 4000 00bfc000 0202 05fe
#address-cells 0001
#size-cells
ibm,phandle fffa
ibm,associativity 0004

when default=RHEL5.2:

0 > dev / ls
00c8d200: /ibm,serial
00c8dff8: /chosen
...
00cae2c0: /PowerPC,[EMAIL PROTECTED]
00caf1b8: /PowerPC,[EMAIL PROTECTED]
00cb0120: /[EMAIL PROTECTED]
00cb5af0: /[EMAIL PROTECTED]
00cb5ce0: /[EMAIL PROTECTED]
00cb5ed0: /[EMAIL PROTECTED]
00cb60c0: /[EMAIL PROTECTED]
00cb62b0: /[EMAIL PROTECTED]
00cb64a0: /[EMAIL PROTECTED]
00cb6690: /[EMAIL PROTECTED]
...

The open firmware environment variable "ibm,fw-new-mem-def" is 'false' when
the RHEL5.2 kernel is the default, whereas it is 'true' when 2.6.26-rc1 is
the default kernel to boot. Any input on whether the single memory node in
2.6.26-rc1 should show the size of the available system memory, or whether
there should be many memory nodes for 2.6.26-rc1?

Thanks,
Chandru
[RFC][PATCH] powerpc: add usable-memory property to drconf memory
kexec-tools in user space collects the rtas, crash, and tce region ranges
from /proc/device-tree/ and passes them as linux,usable-memory properties to
the second (kdump) kernel in the device tree buffer. With drconf memory on
power6 machines, we need a similar method to preserve those regions in the
kdump kernel. This patch appends the number 'n' of the n'th lmb entry in
ibm,dynamic-memory, as a string, to the "linux,usable-memory" string, so as
to distinguish each usable-memory region and its size.
arch/powerpc/mm/numa.c needs a similar change, which is not included in this
patch.

Signed-off-by: Chandru S <[EMAIL PROTECTED]>
---

--- arch/powerpc/kernel/prom.c.orig	2008-06-13 11:42:45.0 +0530
+++ arch/powerpc/kernel/prom.c	2008-06-13 12:09:04.0 +0530
@@ -884,9 +884,10 @@ static u64 __init dt_mem_next_cell(int s
  */
 static int __init early_init_dt_scan_drconf_memory(unsigned long node)
 {
-	cell_t *dm, *ls;
-	unsigned long l, n, flags;
+	cell_t *dm, *ls, *udm, *reg, *endp;
+	unsigned long l, n, un, flags;
 	u64 base, size, lmb_size;
+	char buf[64], t[8];
 
 	ls = (cell_t *)of_get_flat_dt_prop(node, "ibm,lmb-size", &l);
 	if (ls == NULL || l < dt_root_size_cells * sizeof(cell_t))
@@ -919,6 +920,41 @@ static int __init early_init_dt_scan_drc
 		}
 		lmb_add(base, size);
 	}
+
+	/* Scan usable-memory properties */
+	for (; un != 0; --un) {
+		base = dt_mem_next_cell(dt_root_addr_cells, &udm);
+		flags = dm[3];
+		udm += 4;
+		if ((flags & 0x80) || !(flags & 0x8))
+			continue;
+		strcpy(buf, "linux,usable-memory");
+		sprintf(t, "%d", (int)un);
+		strcat(buf, t);
+		reg = (cell_t *)of_get_flat_dt_prop(node,
+					(const char *)buf, &l);
+		if (reg == NULL)
+			continue;
+		/* remove the previously added lmb */
+		lmb_remove(base, (base+lmb_size));
+		endp = reg + (l / sizeof(cell_t));
+		while ((endp - reg) >= (dt_root_addr_cells +
+					dt_root_size_cells)) {
+
+			base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+			size = dt_mem_next_cell(dt_root_size_cells, &reg);
+
+			if (size == 0)
+				continue;
+			if (iommu_is_off) {
+				if (base >= 0x8000ul)
+					continue;
+				if ((base + size) > 0x8000ul)
+					size = 0x8000ul - base;
+			}
+			lmb_add(base, size);
+		}
+	}
 	lmb_dump_all();
 	return 0;
 }
Re: [RFC][PATCH] powerpc: add usable-memory property to drconf memory
On Friday 13 June 2008 14:51:39 Chandru wrote:
> kexec-tools in user space collects rtas, crash , tce region ranges
> from /proc/device-tree/ and passes them as linux,usable-memory properties
> to second/kdump kernel in the device tree buffer. With drconf memory in
> power6 machines, we need a similar method to preserve those regions in the
> kdump kernel. This patch adopts a method to append the number 'n' from
> n'th lmb entry in ibm,dynamic-memory as a string to "linux,usable-memory"
> string , so as to distinguish each usable-memory region and it's size.
> Another place that needs similar change but not included in this patch is
> in
> arch/powerpc/mm/numa.c.

The previous patch was missing two lines, hence I am sending it again. Sorry
about this :(.

Signed-off-by: Chandru S <[EMAIL PROTECTED]>
---

--- arch/powerpc/kernel/prom.c.orig	2008-06-13 11:42:45.0 +0530
+++ arch/powerpc/kernel/prom.c	2008-06-13 16:21:04.0 +0530
@@ -884,20 +884,22 @@ static u64 __init dt_mem_next_cell(int s
  */
 static int __init early_init_dt_scan_drconf_memory(unsigned long node)
 {
-	cell_t *dm, *ls;
-	unsigned long l, n, flags;
+	cell_t *dm, *ls, *udm, *reg, *endp;
+	unsigned long l, n, un, flags;
 	u64 base, size, lmb_size;
+	char buf[64], t[8];
 
 	ls = (cell_t *)of_get_flat_dt_prop(node, "ibm,lmb-size", &l);
 	if (ls == NULL || l < dt_root_size_cells * sizeof(cell_t))
 		return 0;
 	lmb_size = dt_mem_next_cell(dt_root_size_cells, &ls);
 
-	dm = (cell_t *)of_get_flat_dt_prop(node, "ibm,dynamic-memory", &l);
+	udm = dm = (cell_t *)of_get_flat_dt_prop(node,
+					"ibm,dynamic-memory", &l);
 	if (dm == NULL || l < sizeof(cell_t))
 		return 0;
 
-	n = *dm++;	/* number of entries */
+	un = n = *dm++;	/* number of entries */
 	if (l < (n * (dt_root_addr_cells + 4) + 1) * sizeof(cell_t))
 		return 0;
 
@@ -919,6 +921,41 @@ static int __init early_init_dt_scan_drc
 		}
 		lmb_add(base, size);
 	}
+
+	/* Scan usable-memory properties */
+	for (; un != 0; --un) {
+		base = dt_mem_next_cell(dt_root_addr_cells, &udm);
+		flags = dm[3];
+		udm += 4;
+		if ((flags & 0x80) || !(flags & 0x8))
+			continue;
+		strcpy(buf, "linux,usable-memory");
+		sprintf(t, "%d", (int)un);
+		strcat(buf, t);
+		reg = (cell_t *)of_get_flat_dt_prop(node,
+					(const char *)buf, &l);
+		if (reg == NULL)
+			continue;
+		/* remove the previously added lmb */
+		lmb_remove(base, (base+lmb_size));
+		endp = reg + (l / sizeof(cell_t));
+		while ((endp - reg) >= (dt_root_addr_cells +
+					dt_root_size_cells)) {
+
+			base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+			size = dt_mem_next_cell(dt_root_size_cells, &reg);
+
+			if (size == 0)
+				continue;
+			if (iommu_is_off) {
+				if (base >= 0x8000ul)
+					continue;
+				if ((base + size) > 0x8000ul)
+					size = 0x8000ul - base;
+			}
+			lmb_add(base, size);
+		}
+	}
 	lmb_dump_all();
 	return 0;
 }
[PATCH 0/4] kdump : add support for ibm, dynamic-reconfiguration-memory for kexec/kdump
Following is a patch series that contains changes for kexec/kdump on Power6
machines. Power6 machines have most of their memory represented in the
/proc/device-tree/ibm,dynamic-reconfiguration-memory node. kexec-tools
currently reads only the memory@ nodes of the device tree.

Patch 1/4 contains the kernel changes. Patches {2,3,4}/4 are changes for
kexec-tools.

The kernel changes are similar to what was already present for the
'linux,usable-memory' property of memory@ nodes. Unlike with memory@ nodes,
most of the memory ranges are now represented under a single property
(ibm,dynamic-memory), so I have appended the lmb entry number to the
'linux,usable-memory' string to identify the usable-memory ranges in the
device tree.

Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]>
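The per-lmb property name described above is just the base string with the entry number appended. The patches build it with strcpy/sprintf/strcat into fixed buffers; the following is a hedged sketch of the same naming with a bounds check, using only standard C. `usable_mem_propname` is a hypothetical helper name, not something from the kernel or kexec-tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build "linux,usable-memory<N>" for lmb entry number N.
 * Returns 0 on success, -1 if `buf` is too small for the full name.
 */
int usable_mem_propname(char *buf, size_t buflen, unsigned int lmb_index)
{
	int n;

	assert(buf != NULL);

	n = snprintf(buf, buflen, "linux,usable-memory%u", lmb_index);
	if (n < 0 || (size_t)n >= buflen)
		return -1;	/* output error or truncated name */
	return 0;
}
```

Both the writer (kexec-tools) and the reader (kernel) must agree on whether numbering starts at 0 or 1; the sketch leaves that choice to the caller.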
[PATCH 1/4] kdump : add support for ibm, dynamic-reconfiguration-memory for kexec/kdump
kexec-tools adds the crash, rtas, and tce memory regions as
linux,usable-memory properties in the device tree. The following kernel
changes recognize these special properties for the
ibm,dynamic-reconfiguration-memory node of the device tree.

Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]>
---

diff -Naurp linux-2.6.26-rc9-orig/arch/powerpc/kernel/prom.c linux-2.6.26-rc9/arch/powerpc/kernel/prom.c
--- linux-2.6.26-rc9-orig/arch/powerpc/kernel/prom.c	2008-07-06 04:23:22.0 +0530
+++ linux-2.6.26-rc9/arch/powerpc/kernel/prom.c	2008-07-07 17:23:58.0 +0530
@@ -884,9 +884,10 @@ static u64 __init dt_mem_next_cell(int s
  */
 static int __init early_init_dt_scan_drconf_memory(unsigned long node)
 {
-	cell_t *dm, *ls;
+	cell_t *dm, *ls, *endp, *usm;
 	unsigned long l, n, flags;
 	u64 base, size, lmb_size;
+	char buf[32], t[8];
 
 	ls = (cell_t *)of_get_flat_dt_prop(node, "ibm,lmb-size", &l);
 	if (ls == NULL || l < dt_root_size_cells * sizeof(cell_t))
@@ -917,7 +918,33 @@ static int __init early_init_dt_scan_drc
 			if ((base + size) > 0x8000ul)
 				size = 0x8000ul - base;
 		}
-		lmb_add(base, size);
+		strcpy(buf, "linux,usable-memory");
+		sprintf(t, "%d", (int)n);
+		strcat(buf, t);
+		usm = (cell_t *)of_get_flat_dt_prop(node,
+						 (const char *)buf, &l);
+		if (usm != NULL) {
+			endp = usm + (l / sizeof(cell_t));
+			while ((endp - usm) >= (dt_root_addr_cells +
+						 dt_root_size_cells)) {
+				base = dt_mem_next_cell(dt_root_addr_cells,
+							 &usm);
+				size = dt_mem_next_cell(dt_root_size_cells,
+							 &usm);
+				if (size == 0)
+					continue;
+				if (iommu_is_off) {
+					if ((base + size) > 0x8000ul)
+						size = 0x8000ul - base;
+				}
+				lmb_add(base, size);
+			}
+
+			/* Continue with next lmb entry */
+			continue;
+		} else {
+			lmb_add(base, size);
+		}
 	}
 	lmb_dump_all();
 	return 0;
diff -Naurp linux-2.6.26-rc9-orig/arch/powerpc/mm/numa.c linux-2.6.26-rc9/arch/powerpc/mm/numa.c
--- linux-2.6.26-rc9-orig/arch/powerpc/mm/numa.c	2008-07-06 04:23:22.0 +0530
+++ linux-2.6.26-rc9/arch/powerpc/mm/numa.c	2008-07-07 17:50:35.0 +0530
@@ -349,18 +349,33 @@ static unsigned long __init numa_enforce
 	return lmb_end_of_DRAM() - start;
 }
 
+static void set_nodeinfo(int nid, unsigned long start, unsigned long size)
+{
+	fake_numa_create_new_node(((start + size) >> PAGE_SHIFT),
+					&nid);
+	node_set_online(nid);
+
+	size = numa_enforce_memory_limit(start, size);
+	if (!size)
+		return;
+	add_active_range(nid, start >> PAGE_SHIFT,
+			(start >> PAGE_SHIFT) + (size >> PAGE_SHIFT));
+	return;
+}
+
 /*
  * Extract NUMA information from the ibm,dynamic-reconfiguration-memory
  * node. This assumes n_mem_{addr,size}_cells have been set.
  */
 static void __init parse_drconf_memory(struct device_node *memory)
 {
-	const unsigned int *lm, *dm, *aa;
+	const unsigned int *lm, *dm, *aa, *usm;
 	unsigned int ls, ld, la;
 	unsigned int n, aam, aalen;
 	unsigned long lmb_size, size, start;
 	int nid, default_nid = 0;
-	unsigned int ai, flags;
+	unsigned int ai, flags, len, ranges;
+	char buf[32], t[8];
 
 	lm = of_get_property(memory, "ibm,lmb-size", &ls);
 	dm = of_get_property(memory, "ibm,dynamic-memory", &ld);
@@ -396,16 +411,27 @@ static void __init parse_drconf_memory(s
 			nid = default_nid;
 		}
 
-		fake_numa_create_new_node(((start + lmb_size) >> PAGE_SHIFT),
-						&nid);
-		node_set_online(nid);
+		strcpy(buf, "linux,usable-memory");
+		sprintf(t, "%d", (int)n);
+		strcat(buf, t);
+		usm = of_get_property(memory, (const char *)buf, &len);
+		if (usm != NULL) {
+			ranges = (len >> 2)
[PATCH 2/4] kdump : add support for ibm, dynamic-reconfiguration-memory for kexec/kdump
Add linux,usable-memory properties into device tree in case of ibm,dynamic-reconfiguration-memory node of /proc/device-tree Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]> --- diff -Naurp kexec-tools-testing-orig/kexec/arch/ppc64/fs2dt.c kexec-tools-testing/kexec/arch/ppc64/fs2dt.c --- kexec-tools-testing-orig/kexec/arch/ppc64/fs2dt.c 2008-07-07 17:54:58.0 +0530 +++ kexec-tools-testing/kexec/arch/ppc64/fs2dt.c2008-07-07 20:08:53.0 +0530 @@ -122,6 +122,80 @@ static unsigned propnum(const char *name return offset; } +static void add_dyn_reconf_usable_mem_property(int fd) +{ + char fname[MAXPATH], propname[64], t[8], *bname; + const char tbuf[32]; + uint64_t buf[32]; + uint64_t ranges[2*MAX_MEMORY_RANGES]; + uint64_t base, end, loc_base, loc_end; + int range, rlen = 0, i; + + strcpy(fname, pathname); + bname = strrchr(fname, '/'); + bname[0] = '\0'; + bname = strrchr(fname, '/'); + if (strncmp(bname, "/ibm,dynamic-reconfiguration-memory", 36)) + return; + + if (lseek(fd, 4, SEEK_SET) < 0) + die("unrecoverable error: error seeking in \"%s\": %s\n", + pathname, strerror(errno)); + + /* kernel counts the lmb's from 1, hence use i=1 here */ + for (i = 1; i <= num_of_lmbs; i++) { + rlen = 0; + if (read(fd, buf, 24) < 0) + die("unrecoverable error: error reading \"%s\": %s\n", + pathname, strerror(errno)); + + base = (uint64_t) buf[0]; + end = base + lmb_size; + if (~0ULL - base < end) + die("unrecoverable error: mem property overflow\n"); + + for (range = 0; range < usablemem_rgns.size; range++) { + loc_base = usablemem_rgns.ranges[range].start; + loc_end = usablemem_rgns.ranges[range].end; + if (loc_base >= base && loc_end <= end) { + ranges[rlen++] = loc_base; + ranges[rlen++] = loc_end - loc_base; + } else if (base < loc_end && end > loc_base) { + if (loc_base < base) + loc_base = base; + if (loc_end > end) + loc_end = end; + ranges[rlen++] = loc_base; + ranges[rlen++] = loc_end - loc_base; + } + } + if (!rlen) { + /* +* User did not pass any ranges for this 
region. +* Hence, write (0,0) duple in linux,usable-memory +* property such that this region will be ignored. +*/ + ranges[rlen++] = 0; + ranges[rlen++] = 0; + } + + rlen = rlen * sizeof(uint64_t); + /* +* Add linux,usable-memory property. +*/ + *dt++ = 3; + *dt++ = rlen; + strcpy(tbuf, "linux,usable-memory"); + sprintf(t, "%d", i); + strcat(tbuf, t); + *dt++ = propnum((const char *)tbuf); + if ((rlen >= 8) && ((unsigned long)dt & 0x4)) + dt++; + memcpy(dt, &ranges, rlen); + dt += (rlen + 3)/4; + } +} + static void add_usable_mem_property(int fd, int len) { char fname[MAXPATH], *bname; @@ -267,6 +341,10 @@ static void putprops(char *fn, struct di dt += (len + 3)/4; if (!strcmp(dp->d_name, "reg") && usablemem_rgns.size) add_usable_mem_property(fd, len); + if (!strcmp(dp->d_name, "ibm,dynamic-memory") && + usablemem_rgns.size) + add_dyn_reconf_usable_mem_property(fd); + close(fd); } ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
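The per-LMB clipping done in add_dyn_reconf_usable_mem_property() above can be looked at in isolation. What follows is only a simplified sketch, not code from kexec-tools: the helper name clip_to_lmb() and its out-parameters are hypothetical, and the two branches of the patch are folded into a single intersection test.

```c
#include <stdint.h>

/*
 * Intersect a usable-memory range [loc_base, loc_end) with one LMB
 * [base, end). Returns 1 and fills *out_base / *out_size when the two
 * overlap, 0 otherwise. Hypothetical helper for illustration only.
 */
int clip_to_lmb(uint64_t base, uint64_t end,
                uint64_t loc_base, uint64_t loc_end,
                uint64_t *out_base, uint64_t *out_size)
{
    /* No overlap: the caller would fall back to the (0,0) tuple. */
    if (!(base < loc_end && end > loc_base))
        return 0;

    /* Trim the usable range so that it stays inside the LMB. */
    if (loc_base < base)
        loc_base = base;
    if (loc_end > end)
        loc_end = end;

    *out_base = loc_base;
    *out_size = loc_end - loc_base;
    return 1;
}
```

A caller would append (out_base, out_size) to the property for every overlapping range and write a single (0,0) tuple when no range overlaps the LMB.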
[PATCH 3/4] kdump : add support for ibm,dynamic-reconfiguration-memory for kexec/kdump
get crash memory ranges from ibm,dynamic-reconfiguration-memory node of /proc/device-tree. Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]> --- diff -Naurp kexec-tools-testing-orig/kexec/arch/ppc64/crashdump-ppc64.c kexec-tools-testing/kexec/arch/ppc64/crashdump-ppc64.c --- kexec-tools-testing-orig/kexec/arch/ppc64/crashdump-ppc64.c 2008-07-07 17:54:58.0 +0530 +++ kexec-tools-testing/kexec/arch/ppc64/crashdump-ppc64.c 2008-07-07 20:14:02.0 +0530 @@ -84,6 +84,82 @@ mem_rgns_t usablemem_rgns = {0, NULL}; */ uint64_t saved_max_mem = 0; +static unsigned long long cstart, cend; +static int memory_ranges; + +/* + * Exclude the region that lies within crashkernel + */ +static void exclude_crash_region(uint64_t start, uint64_t end) +{ + if (cstart < end && cend > start) { + if (start < cstart && end > cend) { + crash_memory_range[memory_ranges].start = start; + crash_memory_range[memory_ranges].end = cstart; + crash_memory_range[memory_ranges].type = RANGE_RAM; + memory_ranges++; + crash_memory_range[memory_ranges].start = cend; + crash_memory_range[memory_ranges].end = end; + crash_memory_range[memory_ranges].type = RANGE_RAM; + memory_ranges++; + } else if (start < cstart) { + crash_memory_range[memory_ranges].start = start; + crash_memory_range[memory_ranges].end = cstart; + crash_memory_range[memory_ranges].type = RANGE_RAM; + memory_ranges++; + } else if (end > cend) { + crash_memory_range[memory_ranges].start = cend; + crash_memory_range[memory_ranges].end = end; + crash_memory_range[memory_ranges].type = RANGE_RAM; + memory_ranges++; + } + } else { + crash_memory_range[memory_ranges].start = start; + crash_memory_range[memory_ranges].end = end; + crash_memory_range[memory_ranges].type = RANGE_RAM; + memory_ranges++; + } +} + +static int get_dyn_reconf_crash_memory_ranges() +{ + uint64_t start, end; + char fname[128], buf[32]; + FILE *file; + int i, n; + + strcpy(fname, "/proc/device-tree/"); + strcat(fname, 
"ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory"); + if ((file = fopen(fname, "r")) == NULL) { + perror(fname); + return -1; + } + + fseek(file, 4, SEEK_SET); + for (i = 0; i < num_of_lmbs; i++) { + if ((n = fread(buf, 1, 24, file)) < 0) { + perror(fname); + fclose(file); + return -1; + } + if (memory_ranges >= (max_memory_ranges + 1)) { + /* No space to insert another element. */ + fprintf(stderr, + "Error: Number of crash memory ranges" + " excedeed the max limit\n"); + return -1; + } + + start = ((uint64_t *)buf)[0]; + end = start + lmb_size; + if (start == 0 && end >= (BACKUP_SRC_END + 1)) + start = BACKUP_SRC_END + 1; + exclude_crash_region(start, end); + } + fclose(file); + return 0; +} + /* Reads the appropriate file and retrieves the SYSTEM RAM regions for whom to * create Elf headers. Keeping it separate from get_memory_ranges() as * requirements are different in the case of normal kexec and crashdumps. @@ -98,7 +174,6 @@ uint64_t saved_max_mem = 0; static int get_crash_memory_ranges(struct memory_range **range, int *ranges) { - int memory_ranges = 0; char device_tree[256] = "/proc/device-tree/"; char fname[256]; char buf[MAXBYTES]; @@ -106,7 +181,7 @@ static int get_crash_memory_ranges(struc FILE *file; struct dirent *dentry, *mentry; int i, n, crash_rng_len = 0; - unsigned long long start, end, cstart, cend; + unsigned long long start, end; int page_size; crash_max_memory_ranges = max_memory_ranges + 6; @@ -129,7 +204,16 @@ static int get_crash_memory_ranges(struc perror(device_tree); goto err; } + + cstart = crash_base; + cend = crash_base + crash_size; + while ((dentry = readdir(dir)) != NULL) { + if (!strncmp(dentry->d_name, + "ibm,dynamic-reconfiguration-memory", 35)){ + get_dyn_recon
[PATCH 4/4] kdump : add support for ibm,dynamic-reconfiguration-memory for kexec/kdump
Changes to kexec-ppc64.c in case of ibm,dynamic-reconfiguration-memory node of /proc/device-tree. Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]> --- diff -Naurp kexec-tools-testing-orig/kexec/arch/ppc64/kexec-ppc64.c kexec-tools-testing/kexec/arch/ppc64/kexec-ppc64.c --- kexec-tools-testing-orig/kexec/arch/ppc64/kexec-ppc64.c 2008-07-07 17:54:58.0 +0530 +++ kexec-tools-testing/kexec/arch/ppc64/kexec-ppc64.c 2008-07-07 20:04:36.0 +0530 @@ -96,6 +96,49 @@ err1: } +static int count_dyn_reconf_memory_ranges(void) +{ + char device_tree[] = "/proc/device-tree/"; + char fname[128]; + char buf[32]; + FILE *file; + + strcpy(fname, device_tree); + strcat(fname, "ibm,dynamic-reconfiguration-memory/ibm,lmb-size"); + if ((file = fopen(fname, "r")) == NULL) { + perror(fname); + return -1; + } + + if (fread(buf, 1, 8, file) < 0) { + perror(fname); + fclose(file); + return -1; + } + + lmb_size = ((uint64_t *)buf)[0]; + fclose(file); + + /* Get number of lmbs from ibm,dynamic-memory */ + strcpy(fname, device_tree); + strcat(fname, "ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory"); + if ((file = fopen(fname, "r")) == NULL) { + perror(fname); + return -1; + } + /* +* first 4 bytes provide number of entries(lmbs) +*/ + if (fread(buf, 1, 4, file) < 0) { + perror(fname); + fclose(file); + return -1; + } + num_of_lmbs = ((unsigned int *)buf)[0]; + max_memory_ranges += num_of_lmbs; + fclose(file); +} + /* * Count the memory nodes under /proc/device-tree and populate the * max_memory_ranges variable. 
This variable replaces MAX_MEMORY_RANGES @@ -113,6 +156,12 @@ static int count_memory_ranges(void) } while ((dentry = readdir(dir)) != NULL) { + if (!strncmp(dentry->d_name, + "ibm,dynamic-reconfiguration-memory", 35)){ + count_dyn_reconf_memory_ranges(); + continue; + } + if (strncmp(dentry->d_name, "memory@", 7) && strcmp(dentry->d_name, "memory") && strncmp(dentry->d_name, "pci@", 4)) @@ -128,7 +177,52 @@ static int count_memory_ranges(void) return 0; } +static void add_base_memory_range(uint64_t start, uint64_t end) +{ + base_memory_range[nr_memory_ranges].start = start; + base_memory_range[nr_memory_ranges].end = end; + base_memory_range[nr_memory_ranges].type = RANGE_RAM; + nr_memory_ranges++; + + dbgprintf("%016llx-%016llx : %x\n", + base_memory_range[nr_memory_ranges-1].start, + base_memory_range[nr_memory_ranges-1].end, + base_memory_range[nr_memory_ranges-1].type); +} + +static int get_dyn_reconf_base_ranges(void) +{ + uint64_t start, end; + char fname[128], buf[32]; + FILE *file; + int i, n; + + strcpy(fname, "/proc/device-tree/"); + strcat(fname, + "ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory"); + if ((file = fopen(fname, "r")) == NULL) { + perror(fname); + return -1; + } + + fseek(file, 4, SEEK_SET); + for (i = 0; i < num_of_lmbs; i++) { + if ((n = fread(buf, 1, 24, file)) < 0) { + perror(fname); + fclose(file); + return -1; + } + if (nr_memory_ranges >= max_memory_ranges) + return -1; + + start = ((uint64_t *)buf)[0]; + end = start + lmb_size; + add_base_memory_range(start, end); + } + fclose(file); + return 0; +} /* Sort the base ranges in memory - this is useful for ensuring that our * ranges are in ascending order, even if device-tree read of memory nodes * is done differently. 
Also, could be used for other range coalescing later @@ -156,7 +250,7 @@ static int sort_base_ranges(void) /* Get base memory ranges */ static int get_base_ranges(void) { - int local_memory_ranges = 0; + uint64_t start, end; char device_tree[256] = "/proc/device-tree/"; char fname[256]; char buf[MAXBYTES]; @@ -170,6 +264,11 @@ static int get_base_ranges(void) return -1; } while ((dentry = readdir(dir)) != NULL) { + if (!strncmp(dentry->d_name, + "ibm,dynamic-reconfiguration-memory", 35)) { +
Re: [PATCH 1/4] kdump : add support for ibm,dynamic-reconfiguration-memory for kexec/kdump
Thanks for the review comments. I will change to 'snprintf' at all the places and remove the unnecessary casts.

On Tuesday 08 July 2008 07:06:46 Stephen Rothwell wrote:
> Hi Chandru,
>
> On Tue, 8 Jul 2008 00:14:24 +0530 Chandru <[EMAIL PROTECTED]> wrote:
> > +		if (usm != NULL) {
> > +			ranges = (len >> 2) / (n_mem_addr_cells +
> >					  ^^
> len / sizeof(u32) ?

ranges is made to count the number of linux,usable-memory properties in the device tree. It's acting as a counter here. So the expression in the patch is fine.

Thanks,
Chandru
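For reference, the two spellings discussed above give the same value. A minimal sketch of the count, assuming a property of len bytes made of 32-bit cells; the helper name usable_mem_ranges() is hypothetical and is not a kernel function:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Number of (base, size) tuples in a property of `len` bytes, where each
 * tuple takes addr_cells + size_cells 32-bit cells.
 * Hypothetical helper for illustration only.
 */
unsigned int usable_mem_ranges(size_t len, unsigned int addr_cells,
                               unsigned int size_cells)
{
    unsigned int cells_per_range = addr_cells + size_cells;

    if (cells_per_range == 0)
        return 0;

    /* len / sizeof(uint32_t) is the same value as len >> 2. */
    return (unsigned int)((len / sizeof(uint32_t)) / cells_per_range);
}
```

With two address cells and two size cells, a 16-byte property holds one range and a 48-byte property holds three.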
Re: [PATCH 1/4] kdump : add support for ibm,dynamic-reconfiguration-memory for kexec/kdump
On Friday 11 July 2008 03:57:53 Nathan Fontenot wrote: > Hello Chandru, > > > static int __init early_init_dt_scan_drconf_memory(unsigned long node) > > { > > - cell_t *dm, *ls; > > + cell_t *dm, *ls, *endp, *usm; > > unsigned long l, n, flags; > > u64 base, size, lmb_size; > > + char buf[32], t[8]; > > > > ls = (cell_t *)of_get_flat_dt_prop(node, "ibm,lmb-size", &l); > > if (ls == NULL || l < dt_root_size_cells * sizeof(cell_t)) > > @@ -917,7 +918,33 @@ static int __init early_init_dt_scan_drc > > if ((base + size) > 0x8000ul) > > size = 0x8000ul - base; > > } > > - lmb_add(base, size); > > + strcpy(buf, "linux,usable-memory"); > > + sprintf(t, "%d", (int)n); > > + strcat(buf, t); > > + usm = (cell_t *)of_get_flat_dt_prop(node, > > +(const char *)buf, &l); > > + if (usm != NULL) { > > + endp = usm + (l / sizeof(cell_t)); > > + while ((endp - usm) >= (dt_root_addr_cells + > > +dt_root_size_cells)) { > > + base = dt_mem_next_cell(dt_root_addr_cells, > > +&usm); > > + size = dt_mem_next_cell(dt_root_size_cells, > > +&usm); > > + if (size == 0) > > + continue; > > + if (iommu_is_off) { > > + if ((base + size) > 0x8000ul) > > + size = 0x8000ul - base; > > + } > > + lmb_add(base, size); > > + } > > + > > + /* Continue with next lmb entry */ > > + continue; > > + } else { > > + lmb_add(base, size); > > + } > > } > > I am still digging through the kexec tools but I don't think you want > the processing of the linux,usable-memory property inside of the > for (; n!= 0; --n) loop. This should be moved up so that it looks for > the linux,usable-memory property and parses it, then if it is not found > look for the ibm,dynamic-reconfiguration-memory property and parse it. > > There is no need to look for the linux-usable-memory property every time > a piece of the ibm,dynamic-reconfiguration-memory property is parsed. > > -Nathan Hello Nathan, Thanks for the review. kexec-tools adds a 'linux,usable-memory' property for each memory range listed in the device tree. 
If a region is not a crashkernel, rtas or tce region, kexec-tools sets base and size to zero but still adds a linux,usable-memory property for it. If we look at the code above, in a kdump kernel we don't add an lmb through lmb_add() unless the region is one of those mentioned above. We check for (size == 0) and continue with the next lmb if it is so. We still have the complete device tree, which kexec-tools passes in as the buffer that we are scanning here, and the linux,usable-memory properties make the kdump kernel see only those memory regions that it is supposed to use.

I just worked on another version of this patch based on comments from Michael Neuling and Stephen Rothwell. I will post it shortly.

Thanks,
Chandru
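The (size == 0) handling described above can be sketched outside the kernel as follows. This is an illustration under assumptions, not the patch itself: the tuples are taken to be already decoded into 64-bit (base, size) pairs, and struct mem_range and collect_usable() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct mem_range {
    uint64_t base;
    uint64_t size;
};

/*
 * Copy every non-empty (base, size) tuple from tuples[] into out[] and
 * return how many were kept. A (0,0) tuple marks a region the kdump
 * kernel must ignore, so it is skipped rather than added.
 */
size_t collect_usable(const uint64_t *tuples, size_t ntuples,
                      struct mem_range *out, size_t max_out)
{
    size_t i, n = 0;

    for (i = 0; i < ntuples && n < max_out; i++) {
        uint64_t base = tuples[2 * i];
        uint64_t size = tuples[2 * i + 1];

        if (size == 0)
            continue;   /* placeholder written for an unusable region */

        out[n].base = base;
        out[n].size = size;
        n++;
    }
    return n;
}
```

In the kernel the equivalent of storing into out[] is the lmb_add(base, size) call.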
[PATCH 1/4][V2] powerpc : add support for linux,usable-memory properties for drconf memory
Scan for linux,usable-memory properties in case of dynamic reconfiguration memory. Support for kexec/kdump. Signed-off-by: Chandru Siddalingappa <[EMAIL PROTECTED]> --- Patch applies on linux-next tree (patch-v2.6.26-rc9-next-20080711.gz) arch/powerpc/kernel/prom.c | 40 +++-- arch/powerpc/mm/numa.c | 48 --- 2 files changed, 65 insertions(+), 23 deletions(-) diff -Naurp linux-2.6.26-rc9-orig/arch/powerpc/kernel/prom.c linux-2.6.26-rc9/arch/powerpc/kernel/prom.c --- linux-2.6.26-rc9-orig/arch/powerpc/kernel/prom.c2008-07-11 14:44:55.0 +0530 +++ linux-2.6.26-rc9/arch/powerpc/kernel/prom.c 2008-07-11 14:58:26.0 +0530 @@ -888,9 +888,10 @@ static u64 __init dt_mem_next_cell(int s */ static int __init early_init_dt_scan_drconf_memory(unsigned long node) { - cell_t *dm, *ls; - unsigned long l, n, flags; + cell_t *dm, *ls, *usm; + unsigned long l, n, flags, ranges; u64 base, size, lmb_size; + char buf[32]; ls = (cell_t *)of_get_flat_dt_prop(node, "ibm,lmb-size", &l); if (ls == NULL || l < dt_root_size_cells * sizeof(cell_t)) @@ -914,14 +915,37 @@ static int __init early_init_dt_scan_drc or if the block is not assigned to this partition (0x8) */ if ((flags & 0x80) || !(flags & 0x8)) continue; - size = lmb_size; - if (iommu_is_off) { + if (iommu_is_off) if (base >= 0x8000ul) continue; - if ((base + size) > 0x8000ul) - size = 0x8000ul - base; - } - lmb_add(base, size); + size = lmb_size; + + /* +* Append 'n' to 'linux,usable-memory' to get special +* properties passed in by tools like kexec-tools. Relevant +* only if this is a kexec/kdump kernel. 
+*/ + sprintf(buf, "linux,usable-memory%d", (int)n); + usm = of_get_flat_dt_prop(node, buf, &l); + ranges = 1; + if (usm != NULL) + ranges = (l >> 2)/(dt_root_addr_cells + + dt_root_size_cells); + do { + if (usm != NULL) { + base = dt_mem_next_cell(dt_root_addr_cells, +&usm); + size = dt_mem_next_cell(dt_root_size_cells, +&usm); + if (size == 0) + break; + } + if (iommu_is_off) + if ((base + size) > 0x8000ul) + size = 0x8000ul - base; + + lmb_add(base, size); + } while (--ranges); } lmb_dump_all(); return 0; diff -Naurp linux-2.6.26-rc9-orig/arch/powerpc/mm/numa.c linux-2.6.26-rc9/arch/powerpc/mm/numa.c --- linux-2.6.26-rc9-orig/arch/powerpc/mm/numa.c2008-07-11 14:44:55.0 +0530 +++ linux-2.6.26-rc9/arch/powerpc/mm/numa.c 2008-07-11 15:01:56.0 +0530 @@ -493,11 +493,13 @@ static unsigned long __init numa_enforce */ static void __init parse_drconf_memory(struct device_node *memory) { - const u32 *dm; - unsigned int n, rc; - unsigned long lmb_size, size; + const u32 *dm, *usm; + unsigned int n, rc, len, ranges; + unsigned long lmb_size, size, sz; int nid; struct assoc_arrays aa; + char buf[32]; + u64 base; n = of_get_drconf_memory(memory, &dm); if (!n) @@ -524,19 +526,35 @@ static void __init parse_drconf_memory(s nid = of_drconf_to_nid_single(&drmem, &aa); - fake_numa_create_new_node( - ((drmem.base_addr + lmb_size) >> PAGE_SHIFT), - &nid); - - node_set_online(nid); - - size = numa_enforce_memory_limit(drmem.base_addr, lmb_size); - if (!size) - continue; + /* +* Append 'n' to 'linux,usable-memory' to get special +* properties passed in by tools like kexec-tools. Relevant +* only if this is a kexec/kdump kernel. +*/ + sprintf(buf, "linux,usable-memory%d", (int)n); + usm = of_get_property(memory, buf, &len); + ranges = 1; + if (usm != NULL) +
kernel panics with crashkernel=256M while booting
0> 3b8100d8 a3ad000a e89e8028
Rebooting in 180 seconds..
=

When booted with crashkernel=2...@32m or any memory size less than this, the system boots properly. The following was observed.

The system comes up with two nodes (0-256M and 256M-4GB). The crashkernel memory reservation spans these two nodes. mark_reserved_regions_for_nid() in arch/powerpc/mm/numa.c resizes the reserved part of the memory within the node as

	...
	...
	if (end_pfn > node_ar.end_pfn)
		reserve_size = (node_ar.end_pfn << PAGE_SHIFT)
					- (start_pfn << PAGE_SHIFT);

but reserve_bootmem_node() in mm/bootmem.c rounds up the pfn value of end:

	end = PFN_UP(physaddr + size);

This causes end to get a value past the last page in the 0-256M node. Again, when reserve_bootmem_node() returns, mark_reserved_regions_for_nid() loops around to set the rest of the crashkernel memory in the next node as reserved. It references NODE_DATA(node_ar.nid) and this causes another 'Oops: kernel access of bad area' problem. The following changes made the system boot with any crashkernel memory size.

Fix code for reserved memory spanning across nodes

Signed-off-by: Chandru S
---

--- linux-2.6.28-rc9//arch/powerpc/mm/numa.c.orig	2008-12-22 04:23:24.0 -0600
+++ linux-2.6.28-rc9/arch/powerpc/mm/numa.c	2008-12-22 04:24:25.0 -0600
@@ -995,10 +995,11 @@ void __init do_init_bootmem(void)
 				  start_pfn, end_pfn);
 
 		free_bootmem_with_active_regions(nid, end_pfn);
+	}
+
+	for_each_online_node(nid) {
 		/*
-		 * Be very careful about moving this around.  Future
-		 * calls to careful_allocation() depend on this getting
-		 * done correctly.
+		 * Be very careful about moving this around.
 		 */
 		mark_reserved_regions_for_nid(nid);
 		sparse_memory_present_with_active_regions(nid);
--- linux-2.6.28-rc9/mm/bootmem.c.orig	2008-12-19 10:49:24.0 -0600
+++ linux-2.6.28-rc9/mm/bootmem.c	2008-12-19 10:49:33.0 -0600
@@ -375,10 +375,14 @@ int __init reserve_bootmem_node(pg_data_
 				 unsigned long size, int flags)
 {
 	unsigned long start, end;
+	bootmem_data_t *bdata = pgdat->bdata;
 
 	start = PFN_DOWN(physaddr);
 	end = PFN_UP(physaddr + size);
 
+	if (end > bdata->node_low_pfn)
+		end = bdata->node_low_pfn;
+
 	return mark_bootmem_node(pgdat->bdata, start, end, 1, flags);
 }
 
___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
On Tuesday 30 December 2008 03:06:07 Dave Hansen wrote:
> On Fri, 2008-12-26 at 11:59 +1100, Paul Mackerras wrote:
> > > +	}
> > > +
> > > +	for_each_online_node(nid) {
> > >  		/*
> > > -		 * Be very careful about moving this around.  Future
> > > -		 * calls to careful_allocation() depend on this getting
> > > -		 * done correctly.
> > > +		 * Be very careful about moving this around.
> > >  		 */
> > >  		mark_reserved_regions_for_nid(nid);
> > >  		sparse_memory_present_with_active_regions(nid);
>
> I think this reintroduces one of the bugs that I squashed. You *have*
> to call mark_reserved_regions_for_nid() right after you do
> free_bootmem_with_active_regions(). Otherwise, someone else can
> bootmem_alloc() a reserved region from that node.

Thanks for the review comments, Dave. With commit a4c74ddd5ea3db53fc73d29c222b22656a7d05be, I see this has been taken care of in mark_reserved_regions_for_nid(). In that case we may only need the change made in reserve_bootmem_node().

Hello Andrew,

Could you please consider the following patch instead of the original patch in this thread?

Thanks,

When booted with crashkernel=2...@32m or any memory size less than this, the system boots properly. The system comes up with two nodes (0-256M and 256M-4GB). The crashkernel memory reservation spans these two nodes. mark_reserved_regions_for_nid() in arch/powerpc/mm/numa.c resizes the reserved part of the memory within the node as

	if (end_pfn > node_ar.end_pfn)
		reserve_size = (node_ar.end_pfn << PAGE_SHIFT)
					- (start_pfn << PAGE_SHIFT);

but reserve_bootmem_node() in mm/bootmem.c rounds up the pfn value of end:

	end = PFN_UP(physaddr + size);

This causes end to get a value past the last page in the 0-256M node. The following change restricts the value of end if it exceeds the last pfn in a given node.

Signed-off-by: Chandru S
Cc: Dave Hansen
---
 mm/bootmem.c |    4 ++++
 1 file changed, 4 insertions(+)

--- linux-2.6.28/mm/bootmem.c.orig	2009-01-05 20:42:12.0 +0530
+++ linux-2.6.28/mm/bootmem.c	2009-01-05 20:43:53.0 +0530
@@ -375,10 +375,14 @@ int __init reserve_bootmem_node(pg_data_
 				 unsigned long size, int flags)
 {
 	unsigned long start, end;
+	bootmem_data_t *bdata = pgdat->bdata;
 
 	start = PFN_DOWN(physaddr);
 	end = PFN_UP(physaddr + size);
 
+	if (end > bdata->node_low_pfn)
+		end = bdata->node_low_pfn;
+
 	return mark_bootmem_node(pgdat->bdata, start, end, 1, flags);
 }
 
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
On Monday 05 January 2009 22:00:33 Dave Hansen wrote:
> OK, I had to think about this for a good, long time. That's bad. :)
>
> There are two things that we're dealing with here: "active regions" and
> the NODE_DATA's. The if() you've pasted above resizes the reservation
> so that it fits into the current active region. However, as you noted,
> we haven't resized it so that it fits into the NODE_DATA() that we're
> looking at. We call into the bootmem code, and BUG_ON().
>
> The thing I don't like about this is that it might hide bugs in other
> callers. This really is a ppc-specific thing and, although what you
> wrote will fix the bug on ppc, it will probably cause someone in the
> future to call reserve_bootmem_node() with too large a reservation and
> get a silent failure (not reserving the requested size) back.
>
> We really do need to go take a hard look at the whole interaction
> between lmb's, node active regions, and the NUMA code some day. It has
> kinda grown to be a bit ungainly.
>
> How about we just consult the NODE_DATA() in
> mark_reserved_regions_for_nid() instead of reserve_bootmem_node()?

I don't know how you wanted NODE_DATA() to be consulted here. That is, before calling reserve_bootmem_node(), should we have a condition

	if (PFN_UP(physbase+reserve_size) > node_end_pfn)

and then resize reserve_size again so that PFN_UP() will equate to node_end_pfn?

Also, I was wondering whether, in reserve_bootmem_node(),

	end = PFN_DOWN();

would do. With the recent changes from you that went into 2.6.28 stable (commit a4c74ddd5ea3db53fc73d29c222b22656a7d05be), it worked on the system with PFN_DOWN().

Thanks,
Chandru
Re: kernel panics with crashkernel=256M while booting
On Wednesday 07 January 2009 07:20:17 Benjamin Herrenschmidt wrote:
> On Mon, 2008-12-22 at 16:14 +0530, Chandru wrote:
> > On a ppc64 machine booting linux-2.6.28-rc9 with crashkernel=2...@32m
> > boot parameter caused the kernel to panic while booting. Following is the
> > console message..
>
> This is a fix to generic code. Can you resubmit it to linux-mm and/or
> linux-kernel ? You can still CC linuxppc-dev but it should be handled by
> one of the core mm maintainers.
>
> Cheers,
> Ben.

It's been submitted to lkml too. You have also been added to the cc list of the thread.

Chandru
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
On Wednesday 07 January 2009 22:55:25 Dave Hansen wrote:
>
> I'm just suggesting making your fix in the ppc code instead of in
> mm/bootmem.c.
>

Here are the changes that helped to boot the kernel. Please review it.

Thanks,

Signed-off-by: Chandru S
Cc: Dave Hansen
---
 arch/powerpc/mm/numa.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- linux-2.6.28/arch/powerpc/mm/numa.c.orig	2009-01-08 03:20:41.0 -0600
+++ linux-2.6.28/arch/powerpc/mm/numa.c	2009-01-08 03:50:41.0 -0600
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -898,9 +899,17 @@ static void mark_reserved_regions_for_ni
 			 * if reserved region extends past active region
 			 * then trim size to active region
 			 */
-			if (end_pfn > node_ar.end_pfn)
+			if (end_pfn > node_ar.end_pfn) {
 				reserve_size = (node_ar.end_pfn << PAGE_SHIFT)
 					- (start_pfn << PAGE_SHIFT);
+				/*
+				 * resize it further if the reservation could
+				 * cross the last page in this node
+				 */
+				if (PFN_UP(physbase+reserve_size) >
+							node_end_pfn)
+					reserve_size -= PAGE_SIZE;
+			}
 			/*
 			 * Only worry about *this* node, others may not
 			 * yet have valid NODE_DATA().
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
On Friday 09 January 2009 01:33:12 Dave Hansen wrote: > Now I'm even more confused. Could you please send a fully changelogged > patch that describes the problem, and how this fixes it? This just > seems like an off-by-one error, which isn't what I thought we had before > at all. > > I'm also horribly confused why PFN_UP is needed here. Is 'physbase' not > page aligned? reserve_size looks like it *has* to be. 'end_pfn' is > always (as far as I have ever seen in the kernel) the pfn of the page > after the area we are interested in and we treat it as such in that > function. In the case of an unaligned physbase, that wouldn't be true. > > Think of the case where we have a 1-byte reservation. start_pfn will > equal end_pfn and we won't go into that while loop at *all* and won't > reserve anything. > > Does 'end_pfn' need fixing? > Attached is the console log with debug command line parameters enabled and with couple of more debug statements added to the code. Please take a look at it. thanks, Chandru Using 0078209e bytes for initrd buffer Please wait, loading kernel... Allocated 00d0 bytes for kernel @ 02d0 Elf64 kernel loaded... Loading ramdisk... ramdisk loaded 0078209e @ 00d0 OF stdout device is: /vdevice/v...@3000 Hypertas detected, assuming LPAR ! command line: root=/dev/disk/by-id/scsi-35000cca0030c7be2-part3 xmon=on crashkernel=2...@32m debug bootmem_debug numa=debug memory layout at init: alloc_bottom : 0388 alloc_top: 1000 alloc_top_hi : 0001 rmo_top : 1000 ram_top : 0001 Looking for displays instantiating rtas at 0x0f4e ... done boot cpu hw idx copying OF device tree ... Building dt strings... Building dt structure... Device tree strings 0x03a9 -> 0x03a917bc Device tree struct 0x03aa -> 0x03ac Calling quiesce ... 
returning from prom_init Reserving 256MB of memory at 32MB for crashkernel (System RAM: 4096MB) Phyp-dump disabled at boot time Using pSeries machine description Page orders: linear mapping = 24, virtual = 16, io = 12 Using 1TB segments Found initrd at 0xc0d0:0xc148209e console [udbg0] enabled Partition configured for 4 cpus. CPU maps initialized for 2 threads per core (thread shift is 1) Starting Linux PPC64 #7 SMP Fri Jan 9 04:50:05 CST 2009 - ppc64_pft_size= 0x1a physicalMemorySize= 0x1 htab_hash_mask= 0x7 - Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 2.6.28-1-ppc64 (r...@rulerlp10) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #7 SMP Fri Jan 9 04:50:05 CST 2009 [boot]0012 Setup Arch NUMA associativity depth for CPU/Memory: 3 Node 0 Memory: Node 5 Memory: 0x0-0x1000 Node 7 Memory: 0x1000-0x1 adding cpu 0 to node 0 node 0 NODE_DATA() = c000fffeda80 node 5 NODE_DATA() = c1f78800 start_paddr = 0 end_paddr = 1000 bootmap_paddr = 1f6 bootmem::init_bootmem_core nid=5 start=0 map=1f6 end=1000 mapsize=200 bootmem::mark_bootmem_node nid=5 start=0 end=1000 reserve=0 flags=0 bdata->node_min_pfn=0x0 bdata->node_low_pfn=0x1000 bootmem::__free nid=5 start=0 end=1000 reserve_bootmem : physbase :0x0 size:0xb7 reserve_size=0xb7 nid=5 start_pfn:0x0 end_pfn:0xb7 node_ar.start_pfn:0x0 node_ar.end_pfn:0x1000 node->node_start_pfn:0x0 node->node_spanned_pages:4096 bootmem::mark_bootmem_node nid=5 start=0 end=b7 reserve=1 flags=0 bdata->node_min_pfn=0x0 bdata->node_low_pfn=0x1000 bootmem::__reserve nid=5 start=0 end=b7 flags=0 reserve_bootmem : physbase :0xd0 size:0x78209e reserve_size=0x78209e nid=5 start_pfn:0xd0 end_pfn:0x148 node_ar.start_pfn:0x0 node_ar.end_pfn:0x1000 node->node_start_pfn:0x0 node->node_spanned_pages:4096 bootmem::mark_bootmem_node nid=5 start=d0 end=149 reserve=1 flags=0 bdata->node_min_pfn=0x0 bdata->node_low_pfn=0x1000 bootmem::__reserve nid=5 start=d0 end=149 flags=0 reserve_bootmem : physbase :0xd0 
size:0x79 reserve_size=0x79 nid=5 start_pfn:0xd0 end_pfn:0x149 node_ar.start_pfn:0x0 node_ar.end_pfn:0x1000 node->node_start_pfn:0x0 node->node_spanned_pages:4096 bootmem::mark_bootmem_node nid=5 start=d0 end=149 reserve=1 flags=0 bdata->node_min_pfn=0x0 bdata->node_low_pfn=0x1000 bootmem::__reserve nid=5 start=d0 end=149 flags=0 bootmem::__reserve silent double reserve of PFN d0 bootmem::__reserve silent double reserve of PFN d1 ... ... ... bootmem::__reserve silent double reserve of PFN 147 bootmem::__reserve silent double reserve of PFN
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
On Friday 09 January 2009 16:37:24 Chandru wrote:
> On Friday 09 January 2009 01:33:12 Dave Hansen wrote:
> > Now I'm even more confused. Could you please send a fully changelogged
> > patch that describes the problem, and how this fixes it? This just
> > seems like an off-by-one error, which isn't what I thought we had before
> > at all.
> >
> > I'm also horribly confused why PFN_UP is needed here. Is 'physbase' not
> > page aligned? reserve_size looks like it *has* to be. 'end_pfn' is
> > always (as far as I have ever seen in the kernel) the pfn of the page
> > after the area we are interested in and we treat it as such in that
> > function. In the case of an unaligned physbase, that wouldn't be true.
> >
> > Think of the case where we have a 1-byte reservation. start_pfn will
> > equal end_pfn and we won't go into that while loop at *all* and won't
> > reserve anything.
> >
> > Does 'end_pfn' need fixing?
> >
>
> Attached is the console log with debug command line parameters enabled and
> with couple of more debug statements added to the code. Please take a look
> at it.
>
> thanks,
> Chandru
>

Hello Dave,

From the debug console output, if there is anything you can add here, please let me know.

thanks
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
On Thursday 15 January 2009 13:35:27 Chandru wrote:
> Hello Dave, From the debug console output, if there is anything you can add
> here, pls let me know.

As we can see from the console output here, physbase isn't page aligned when the panic occurs. So we could just as well pass (start_pfn << PAGE_SHIFT) to reserve_bootmem_node() instead of physbase. Your thoughts?

Also, end_pfn in mark_reserved_regions_for_nid() is defined as

	unsigned long end_pfn = ((physbase + size) >> PAGE_SHIFT);

Does this refer to the pfn after the area that we are interested in?

We have at least two fixes here:

1. Limit start and end to bdata->node_min_pfn and bdata->node_low_pfn in reserve_bootmem_node(), and add comments there that the caller of the function should be aware of how much it is reserving.
2. Pass (start_pfn << PAGE_SHIFT) to reserve_bootmem_node() instead of physbase.

Chandru
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
On Friday 16 January 2009 23:22:57 Dave Hansen wrote:
> Just looking at it, that calculation is OK. But, there was one in your
> dmesg that looked a page too long, like page 0x1001 instead of 0x1000.
> I'd find out how that happened.

That is a result of PFN_UP() in reserve_bootmem_node(), for which we eventually hit the BUG_ON(). Prior to calling reserve_bootmem_node() we have

	node_ar.end_pfn = node->node_end_pfn = PFN_DOWN(physbase+reserve_size).

Hence a PFN_UP() will raise the value of 'end'. The kernel has CONFIG_PPC_64K_PAGES enabled in the config.

Chandru
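The rounding effect described above can be reproduced in user space. This is a sketch under assumptions: PAGE_SHIFT is fixed at 16 to match 64K pages, the PFN_UP/PFN_DOWN macros follow the kernel's definitions, and the helper end_overshoots_node() and the addresses in the usage note are made up for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT  16                      /* 64K pages (CONFIG_PPC_64K_PAGES) */
#define PAGE_SIZE   ((uint64_t)1 << PAGE_SHIFT)
#define PFN_UP(x)   (((uint64_t)(x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x) ((uint64_t)(x) >> PAGE_SHIFT)

/*
 * Returns 1 when rounding the end of [physbase, physbase + size) up to a
 * page boundary lands past node_end_pfn, which is the condition that
 * trips BUG_ON(end > bdata->node_low_pfn). Hypothetical helper.
 */
int end_overshoots_node(uint64_t physbase, uint64_t size,
                        uint64_t node_end_pfn)
{
    return PFN_UP(physbase + size) > node_end_pfn;
}
```

With a node ending at pfn 0x1000 (256M), a reservation trimmed to 0xe000000 bytes overshoots when physbase is 0x2001000, because PFN_UP() yields 0x1001, and does not overshoot when physbase is the page-aligned 0x2000000.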
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
In case either physbase or reserve_size are not page aligned and in addition if the following condition is also true

	node_ar.end_pfn = node->node_end_pfn = PFN_DOWN(physbase+reserve_size).

we may hit the BUG_ON(end > bdata->node_low_pfn) in mark_bootmem_node() in mm/bootmem.c. Hence pass the pfn that the physbase is part of and align reserve_size before calling reserve_bootmem_node().

Signed-off-by: Chandru S
Cc: Dave Hansen
---
 arch/powerpc/mm/numa.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- linux-2.6.29-rc2/arch/powerpc/mm/numa.c.orig	2009-01-19 16:14:49.0 +0530
+++ linux-2.6.29-rc2/arch/powerpc/mm/numa.c	2009-01-19 16:36:38.0 +0530
@@ -901,7 +901,8 @@ static void mark_reserved_regions_for_ni
 		get_node_active_region(start_pfn, &node_ar);
 		while (start_pfn < end_pfn &&
 			node_ar.start_pfn < node_ar.end_pfn) {
-			unsigned long reserve_size = size;
+			unsigned long reserve_size = (size >> PAGE_SHIFT) <<
+							PAGE_SHIFT;
 			/*
 			 * if reserved region extends past active region
 			 * then trim size to active region
@@ -917,7 +918,8 @@ static void mark_reserved_regions_for_ni
 				dbg("reserve_bootmem %lx %lx nid=%d\n",
 					physbase, reserve_size, node_ar.nid);
 				reserve_bootmem_node(NODE_DATA(node_ar.nid),
-						physbase, reserve_size,
+						(start_pfn << PAGE_SHIFT),
+						reserve_size,
 						BOOTMEM_DEFAULT);
 			}
 			/*
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
Chandru wrote:
> If either physbase or reserve_size is not page aligned, and the following condition also holds,
>
>     node_ar.end_pfn = node->node_end_pfn = PFN_DOWN(physbase+reserve_size)
>
> we may hit the BUG_ON(end > bdata->node_low_pfn) in mark_bootmem_node() in mm/bootmem.c. Hence pass the page-aligned address of the pfn that physbase is part of, and round reserve_size down to a page boundary, before calling reserve_bootmem_node().
>
> Signed-off-by: Chandru S
> Cc: Dave Hansen
> ---
>  arch/powerpc/mm/numa.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> --- linux-2.6.29-rc2/arch/powerpc/mm/numa.c.orig	2009-01-19 16:14:49.0 +0530
> +++ linux-2.6.29-rc2/arch/powerpc/mm/numa.c	2009-01-19 16:36:38.0 +0530
> @@ -901,7 +901,8 @@ static void mark_reserved_regions_for_ni
>  		get_node_active_region(start_pfn, &node_ar);
>  		while (start_pfn < end_pfn &&
>  			node_ar.start_pfn < node_ar.end_pfn) {
> -			unsigned long reserve_size = size;
> +			unsigned long reserve_size = (size >> PAGE_SHIFT) <<
> +								PAGE_SHIFT;
>  			/*
>  			 * if reserved region extends past active region
>  			 * then trim size to active region
> @@ -917,7 +918,8 @@ static void mark_reserved_regions_for_ni
>  				dbg("reserve_bootmem %lx %lx nid=%d\n",
>  					physbase, reserve_size, node_ar.nid);
>  				reserve_bootmem_node(NODE_DATA(node_ar.nid),
> -						physbase, reserve_size,
> +						(start_pfn << PAGE_SHIFT),
> +						reserve_size,
>  						BOOTMEM_DEFAULT);
>  			}
>  			/*

Does this patch look good? Do you concur with it?

Thanks,
Chandru
Re: 2.6.28-rc9 panics with crashkernel=256M while booting
On Thursday 22 January 2009 05:59:39 Dave Hansen wrote:
> Let's take, for instance, a 1-byte reservation. With this code, you've
> suddenly turned that into a 0-byte reservation, and that *can't* be
> right. The same thing happens if you have a reservation that spans two
> pages. If you unconditionally round it down, then you might miss the
> part that spans a portion of the second page.
>
> It needs to be rounded down like you are suggesting here, but only in
> the case where we've gone over the *CURRENT* node's boundary. This is
> kinda what that "if (end_pfn > node_ar.end_pfn)" check is doing. But,
> it evidently screws it up if the overlap isn't by an entire page or
> something.

I assumed that the condition 'while (start_pfn < end_pfn && ...)' requires at least a PAGE_SIZE difference between them, and hence went ahead with that patch. My guess was that 1-byte, 2-byte or (PAGE_SIZE - 1)-byte reservations may not even enter that loop. However, we just need a fix for this problem, so if you have a better fix, please post it to lkml.

Thanks,
Chandru
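The objection quoted above can be checked with a few lines of userspace C. This is only a sketch under stated assumptions: 64K pages, the same rounding expression the patch uses, and hypothetical helper names and addresses that do not appear in the kernel.

```c
/* Userspace stand-ins for the kernel macros; 64K pages are assumed. */
#define PAGE_SHIFT	16
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PFN_UP(x)	(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

/* The unconditional round-down the patch applies to reserve_size. */
static unsigned long round_down_to_page(unsigned long size)
{
	return (size >> PAGE_SHIFT) << PAGE_SHIFT;
}

/* Number of pages that the byte range [base, base + size) touches. */
static unsigned long pages_touched(unsigned long base, unsigned long size)
{
	return PFN_UP(base + size) - PFN_DOWN(base);
}
```

A 1-byte reservation rounds down to 0 bytes, so nothing would be reserved. A hypothetical 16-byte reservation starting at 0xfff8 touches two pages (pfn 0 and pfn 1), yet its size also rounds down to 0. A reservation of PAGE_SIZE + 100 bytes keeps only its first PAGE_SIZE bytes, dropping the tail that spills into the next page.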