Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-25 Thread Milton Miller

On Nov 24, 2008, at 6:10 PM, Michael Ellerman wrote:

On Mon, 2008-11-24 at 14:07 -0600, Hollis Blanchard wrote:

On Fri, 2008-11-14 at 16:09 -0600, Hollis Blanchard wrote:


If this is all too much, then I'm close to giving up and burning a
64KB page, which requires only ALIGN_DOWN() in the kernel.


ppc: force memory size to be a multiple of PAGE_SIZE

Ensure that total memory size is page-aligned, because otherwise
bootmem.c gets upset.

This error case was triggered by using 64 KiB pages in the kernel 
while
arch/powerpc/boot/4xx.c arbitrarily reduced the amount of memory by 
4096 (to
work around the CHIP11 errata which affects the last 256 bytes of 
physical memory).


Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
---
This is on a common code path, and lmb_enforce_memory_limit() will now
always take action, so wider testing would be good.

This patch supercedes http://patchwork.ozlabs.org/patch/8211/ .

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -1200,6 +1200,11 @@ void __init early_init_devtree(void *par
early_reserve_mem();
phyp_dump_reserve_mem();

+   /* Ensure that total memory size is page-aligned, because otherwise
+* bootmem.c gets upset. */
+   lmb_analyze();
+   memory_limit = lmb_phys_mem_size()  PAGE_MASK;


All of the current code using memory_limit looks like it'll be safe 
with

this change, although there are several cases of this we could remove:

if (memory_limit  some other condition)

Because memory_limit will now always be true.


memory_limit was the result of parsing mem= from the command line.   
Does this break that?




Still, I think it would be better to only set memory_limit when the mem
size is not a multiple of the PAGE_SIZE - so that memory_limit retains
it's function as both the value of the limit and a boolean.


I would have expected this trimming to occur where we actually transfer 
the memory from lmb to bootmem, since it is bootmem that has the 
aligned size requirement.


milton

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-25 Thread Hollis Blanchard
On Tue, 2008-11-25 at 11:10 +1100, Michael Ellerman wrote:
 
 Still, I think it would be better to only set memory_limit when the mem
 size is not a multiple of the PAGE_SIZE - so that memory_limit retains
 it's function as both the value of the limit and a boolean.

OK, how's this?

ppc: force memory size to be a multiple of PAGE_SIZE

Ensure that total memory size is page-aligned, because otherwise
mark_bootmem() gets upset.

This error case was triggered by using 64 KiB pages in the kernel while
arch/powerpc/boot/4xx.c arbitrarily reduced the amount of memory by 4096 (to
work around a chip bug that affects the last 256 bytes of physical memory).

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -1160,6 +1160,8 @@ static inline void __init phyp_dump_rese
 
 void __init early_init_devtree(void *params)
 {
+   unsigned long limit;
+
DBG( - early_init_devtree(%p)\n, params);
 
/* Setup flat device-tree pointer */
@@ -1200,7 +1202,15 @@ void __init early_init_devtree(void *par
early_reserve_mem();
phyp_dump_reserve_mem();
 
-   lmb_enforce_memory_limit(memory_limit);
+   limit = memory_limit;
+   if (! limit) {
+   /* Ensure that total memory size is page-aligned, because
+* otherwise mark_bootmem() gets upset. */
+   lmb_analyze();
+   limit = lmb_phys_mem_size()  PAGE_MASK;
+   }
+   lmb_enforce_memory_limit(limit);
+
lmb_analyze();
 
DBG(Phys. mem: %lx\n, lmb_phys_mem_size());

-- 
Hollis Blanchard
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-25 Thread Michael Ellerman
On Tue, 2008-11-25 at 15:53 -0600, Hollis Blanchard wrote:
 On Tue, 2008-11-25 at 11:10 +1100, Michael Ellerman wrote:
  
  Still, I think it would be better to only set memory_limit when the mem
  size is not a multiple of the PAGE_SIZE - so that memory_limit retains
  it's function as both the value of the limit and a boolean.
 
 OK, how's this?
 
 ppc: force memory size to be a multiple of PAGE_SIZE
 
 Ensure that total memory size is page-aligned, because otherwise
 mark_bootmem() gets upset.
 
 This error case was triggered by using 64 KiB pages in the kernel while
 arch/powerpc/boot/4xx.c arbitrarily reduced the amount of memory by 4096 (to
 work around a chip bug that affects the last 256 bytes of physical memory).
 
 Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
 
 diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
 --- a/arch/powerpc/kernel/prom.c
 +++ b/arch/powerpc/kernel/prom.c
 @@ -1160,6 +1160,8 @@ static inline void __init phyp_dump_rese
  
  void __init early_init_devtree(void *params)
  {
 + unsigned long limit, memsize;
 +
   DBG( - early_init_devtree(%p)\n, params);
  
   /* Setup flat device-tree pointer */
 @@ -1200,7 +1202,15 @@ void __init early_init_devtree(void *par
   early_reserve_mem();
   phyp_dump_reserve_mem();

I was thinking more like the following:

  
 - lmb_enforce_memory_limit(memory_limit);
 + limit = memory_limit;
 + if (! limit) {
 + /* Ensure that total memory size is page-aligned, because
 +  * otherwise mark_bootmem() gets upset. */
 + lmb_analyze();
  memsize = lmb_phys_mem_size();
  if(memsize  PAGE_MASK != memsize)
limit = memsize  PAGE_MASK;
 + }
 + lmb_enforce_memory_limit(limit);
 +

So that we never needlessly run through the enforce code with limit =
memsize. But maybe it's a bit pedantic.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-24 Thread Hollis Blanchard
On Fri, 2008-11-14 at 16:09 -0600, Hollis Blanchard wrote:
 
 If this is all too much, then I'm close to giving up and burning a
 64KB page, which requires only ALIGN_DOWN() in the kernel.

ppc: force memory size to be a multiple of PAGE_SIZE

Ensure that total memory size is page-aligned, because otherwise
bootmem.c gets upset.

This error case was triggered by using 64 KiB pages in the kernel while
arch/powerpc/boot/4xx.c arbitrarily reduced the amount of memory by 4096 (to
work around the CHIP11 errata which affects the last 256 bytes of physical 
memory).

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
---
This is on a common code path, and lmb_enforce_memory_limit() will now
always take action, so wider testing would be good.

This patch supercedes http://patchwork.ozlabs.org/patch/8211/ .

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -1200,6 +1200,11 @@ void __init early_init_devtree(void *par
early_reserve_mem();
phyp_dump_reserve_mem();
 
+   /* Ensure that total memory size is page-aligned, because otherwise
+* bootmem.c gets upset. */
+   lmb_analyze();
+   memory_limit = lmb_phys_mem_size()  PAGE_MASK;
+
lmb_enforce_memory_limit(memory_limit);
lmb_analyze();
 
-- 
Hollis Blanchard
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-24 Thread Michael Ellerman
On Mon, 2008-11-24 at 14:07 -0600, Hollis Blanchard wrote:
 On Fri, 2008-11-14 at 16:09 -0600, Hollis Blanchard wrote:
  
  If this is all too much, then I'm close to giving up and burning a
  64KB page, which requires only ALIGN_DOWN() in the kernel.
 
 ppc: force memory size to be a multiple of PAGE_SIZE
 
 Ensure that total memory size is page-aligned, because otherwise
 bootmem.c gets upset.
 
 This error case was triggered by using 64 KiB pages in the kernel while
 arch/powerpc/boot/4xx.c arbitrarily reduced the amount of memory by 4096 (to
 work around the CHIP11 errata which affects the last 256 bytes of physical 
 memory).
 
 Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
 ---
 This is on a common code path, and lmb_enforce_memory_limit() will now
 always take action, so wider testing would be good.
 
 This patch supercedes http://patchwork.ozlabs.org/patch/8211/ .
 
 diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
 --- a/arch/powerpc/kernel/prom.c
 +++ b/arch/powerpc/kernel/prom.c
 @@ -1200,6 +1200,11 @@ void __init early_init_devtree(void *par
   early_reserve_mem();
   phyp_dump_reserve_mem();
  
 + /* Ensure that total memory size is page-aligned, because otherwise
 +  * bootmem.c gets upset. */
 + lmb_analyze();
 + memory_limit = lmb_phys_mem_size()  PAGE_MASK;

All of the current code using memory_limit looks like it'll be safe with
this change, although there are several cases of this we could remove:

if (memory_limit  some other condition)

Because memory_limit will now always be true.

Still, I think it would be better to only set memory_limit when the mem
size is not a multiple of the PAGE_SIZE - so that memory_limit retains
it's function as both the value of the limit and a boolean.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


signature.asc
Description: This is a digitally signed message part
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-18 Thread Hollis Blanchard
On Fri, 2008-11-14 at 16:09 -0600, Hollis Blanchard wrote:
 
 Basically my revised proposal is to add explicit memory reservation 
 properties 
 to the device tree. Currently, /memreserve properties in .dts files are not 
 present in the device tree itself, only in the FDT header. I think these 
 reservations should be duplicated in the tree itself, so that they become 
 visible to post-boot tools like kexec.
 
 In summary, all memory reservations will then exist both in the device tree 
 and in the FDT header. Comments?
 
 Impact to uboot: revert memory node truncation; create reservation 
 and /memreserve property.
 
 Impact to cuboot wrapper: revert memory node truncation; create reservation 
 and /memreserve property.
 
 Impact to kernel: none. /memreserve will be ignored, since memory 
 reservations 
 are already handled properly.
 
 Impact to kexec-tools: Must take /memreserve into account when placing data 
 at the end of memory.
 
 If this is all too much, then I'm close to giving up and burning a 64KB page, 
 which requires only ALIGN_DOWN() in the kernel.

Any comments?

-- 
Hollis Blanchard
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-14 Thread Milton Miller

In-Reply-To: [EMAIL PROTECTED]


On Fri Nov 14 at 06:54:15 EST in 2008, Hollis Blanchard wrote:

On Thu, 2008-11-13 at 07:44 +1100, Benjamin Herrenschmidt wrote:

Again, why can't we just stick something in the kernel code that
reserves the last page ? It could be in prom.c or it could be called 
by

affected 4xx platforms by the platform code, whatever, but the reserve
map isn't really meant for that and will not be passed over from 
kernel

to kernel by kexec.


Reserving a page is overkill; only the last 256 bytes are affected. We
need to intercept at the LMB level, because allocations are already 
done

there, so by the time we hit bootmem it's way too late.


I agree with Ben we need to have something in the tree to tell kexec 
and or the kernel of this errata, unless we adapt the kernel to not 
require the memory node be page size aligned.


I instigated a discussion with Josh and Hollis on irc.


I simply don't see a good place to do this in the kernel. It would have
to be before the first lmb_alloc() call, which for safety would put it
inside early_init_devtree() -- along with the other lmb_reserve()
calls.[1]

[1] This is exactly where flat device tree reservations are done, and
that's why the patch I submitted works.




However, ppc_md.probe() hasn't even been called yet, so there's no way
of knowing if we're on an affected system, unless you want to add a
special of_scan_flat_dt() call here.


I think we decided a property is the right way to go, but am not sure 
we decided if it should be a specific property in the /cpus/[EMAIL PROTECTED] nodes 
or a general property that describes a base and length ... in which 
case it is either a property in /memory (cpus nodes are not part of the 
system address space, with an independent size 0 address space).   It 
was also noted if we go the property route. that kexec tools would need 
to know about it since it allocates destination pages based on reading 
/memory reg ranges, although it also has a hardcoded 768M limit which 
might hide this.



I'm open to suggestions, but I don't see a better way than what I
already sent. I think the important part is to call lmb_add() for all
memory, but lmb_reserve() the last 256 bytes before lmb_alloc() 
happens.


It sounds like kexec must have some knowledge of the platform and 
device

tree already, so is this really a big deal? At any rate, this
conversation is somewhat academic, since there is no kexec on 44x... so
maybe this can be re-addressed when that becomes a real issue.


As discussed, kexec userspace has some ideas of platforms, but its very 
general and should not have lists of which cpus have an errata but 
should base all its decisions off the device tree.


Alternatives to adding a property include just trimming the memory node 
(and fixing the kernel to handle memory size not being page aligned), 
and adding an additional node that says this memory is in use.  We 
should handle the memory size not some big power of 2 anyways, and if 
we just create a new node it should not overlap the memory node 
anyways.  Although we did note that due to current kexec implementation 
we can name a node starting with /rtas and use linux,rtas-base and 
rtas-size to reserve any 32 bit chunk of memory even to kexec, although 
that is considered beyond acceptable for this errata fix (some else 
might want to join me in using that to reserve memory for log buffers 
across boot).


It has been described to me that the bug affects any access to the 256 
bytes, so it would be accurate to describe the memory as not existing 
or as this cpu has an errata tnd the dram is really here.  I just say 
it needs to be described in the device tree.  Trimming the memory node 
has the advantage that kexec userspace will not need a patch, adding 
the cpu has errata property would only require a patch for platofrms 
with 768MB (or manual override of the usable memory size via the 
command line).



milton

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-14 Thread Milton Miller

Resend with correct reply threading.

On Fri Nov 14 at 06:54:15 EST in 2008, Hollis Blanchard wrote:

On Thu, 2008-11-13 at 07:44 +1100, Benjamin Herrenschmidt wrote:

Again, why can't we just stick something in the kernel code that
reserves the last page ? It could be in prom.c or it could be called 
by

affected 4xx platforms by the platform code, whatever, but the reserve
map isn't really meant for that and will not be passed over from 
kernel

to kernel by kexec.


Reserving a page is overkill; only the last 256 bytes are affected. We
need to intercept at the LMB level, because allocations are already 
done

there, so by the time we hit bootmem it's way too late.


I agree with Ben we need to have something in the tree to tell kexec 
and or the kernel of this errata, unless we adapt the kernel to not 
require the memory node be page size aligned.


I instigated a discussion with Josh and Hollis on irc.


I simply don't see a good place to do this in the kernel. It would have
to be before the first lmb_alloc() call, which for safety would put it
inside early_init_devtree() -- along with the other lmb_reserve()
calls.[1]

[1] This is exactly where flat device tree reservations are done, and
that's why the patch I submitted works.




However, ppc_md.probe() hasn't even been called yet, so there's no way
of knowing if we're on an affected system, unless you want to add a
special of_scan_flat_dt() call here.


I think we decided a property is the right way to go, but am not sure 
we decided if it should be a specific property in the /cpus/[EMAIL PROTECTED] nodes 
or a general property that describes a base and length ... in which 
case it is either a property in /memory (cpus nodes are not part of the 
system address space, with an independent size 0 address space).   It 
was also noted if we go the property route. that kexec tools would need 
to know about it since it allocates destination pages based on reading 
/memory reg ranges, although it also has a hardcoded 768M limit which 
might hide this.



I'm open to suggestions, but I don't see a better way than what I
already sent. I think the important part is to call lmb_add() for all
memory, but lmb_reserve() the last 256 bytes before lmb_alloc() 
happens.


It sounds like kexec must have some knowledge of the platform and 
device

tree already, so is this really a big deal? At any rate, this
conversation is somewhat academic, since there is no kexec on 44x... so
maybe this can be re-addressed when that becomes a real issue.


As discussed, kexec userspace has some ideas of platforms, but its very 
general and should not have lists of which cpus have an errata but 
should base all its decisions off the device tree.


Alternatives to adding a property include just trimming the memory node 
(and fixing the kernel to handle memory size not being page aligned), 
and adding an additional node that says this memory is in use.  We 
should handle the memory size not some big power of 2 anyways, and if 
we just create a new node it should not overlap the memory node 
anyways.  Although we did note that due to current kexec implementation 
we can name a node starting with /rtas and use linux,rtas-base and 
rtas-size to reserve any 32 bit chunk of memory even to kexec, although 
that is considered beyond acceptable for this errata fix (some else 
might want to join me in using that to reserve memory for log buffers 
across boot).


It has been described to me that the bug affects any access to the 256 
bytes, so it would be accurate to describe the memory as not existing 
or as this cpu has an errata tnd the dram is really here.  I just say 
it needs to be described in the device tree.  Trimming the memory node 
has the advantage that kexec userspace will not need a patch, adding 
the cpu has errata property would only require a patch for platofrms 
with 768MB (or manual override of the usable memory size via the 
command line).



milton


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-14 Thread Hollis Blanchard
On Friday 14 November 2008 11:29:35 Milton Miller wrote:
  I simply don't see a good place to do this in the kernel. It would have
  to be before the first lmb_alloc() call, which for safety would put it
  inside early_init_devtree() -- along with the other lmb_reserve()
  calls.[1]
 
  [1] This is exactly where flat device tree reservations are done, and
  that's why the patch I submitted works.
 
 
  However, ppc_md.probe() hasn't even been called yet, so there's no way
  of knowing if we're on an affected system, unless you want to add a
  special of_scan_flat_dt() call here.
 
 I think we decided a property is the right way to go, but am not sure 
 we decided if it should be a specific property in the /cpus/[EMAIL PROTECTED] 
 nodes 
 or a general property that describes a base and length ... in which 
 case it is either a property in /memory (cpus nodes are not part of the 
 system address space, with an independent size 0 address space).   It 
 was also noted if we go the property route. that kexec tools would need 
 to know about it since it allocates destination pages based on reading 
 /memory reg ranges, although it also has a hardcoded 768M limit which 
 might hide this.
 
  I'm open to suggestions, but I don't see a better way than what I
  already sent. I think the important part is to call lmb_add() for all
  memory, but lmb_reserve() the last 256 bytes before lmb_alloc() 
  happens.
 
  It sounds like kexec must have some knowledge of the platform and 
  device
  tree already, so is this really a big deal? At any rate, this
  conversation is somewhat academic, since there is no kexec on 44x... so
  maybe this can be re-addressed when that becomes a real issue.
 
 As discussed, kexec userspace has some ideas of platforms, but its very 
 general and should not have lists of which cpus have an errata but 
 should base all its decisions off the device tree.
 
 Alternatives to adding a property include just trimming the memory node 
 (and fixing the kernel to handle memory size not being page aligned),
 and adding an additional node that says this memory is in use.  We 
 should handle the memory size not some big power of 2 anyways, and if 
 we just create a new node it should not overlap the memory node 
 anyways.  Although we did note that due to current kexec implementation 
 we can name a node starting with /rtas and use linux,rtas-base and 
 rtas-size to reserve any 32 bit chunk of memory even to kexec, although 
 that is considered beyond acceptable for this errata fix (some else 
 might want to join me in using that to reserve memory for log buffers 
 across boot).
 
 It has been described to me that the bug affects any access to the 256 
 bytes, so it would be accurate to describe the memory as not existing 
 or as this cpu has an errata tnd the dram is really here.  I just say 
 it needs to be described in the device tree.  Trimming the memory node 
 has the advantage that kexec userspace will not need a patch, adding 
 the cpu has errata property would only require a patch for platofrms 
 with 768MB (or manual override of the usable memory size via the 
 command line).

I don't think patching kexec userspace is too onerous, especially if it's done 
now long before kexec exists on 440. That would also allow you to drop your 
rtas hack...

Basically my revised proposal is to add explicit memory reservation properties 
to the device tree. Currently, /memreserve properties in .dts files are not 
present in the device tree itself, only in the FDT header. I think these 
reservations should be duplicated in the tree itself, so that they become 
visible to post-boot tools like kexec.

In summary, all memory reservations will then exist both in the device tree 
and in the FDT header. Comments?

Impact to uboot: revert memory node truncation; create reservation 
and /memreserve property.

Impact to cuboot wrapper: revert memory node truncation; create reservation 
and /memreserve property.

Impact to kernel: none. /memreserve will be ignored, since memory reservations 
are already handled properly.

Impact to kexec-tools: Must take /memreserve into account when placing data 
at the end of memory.

If this is all too much, then I'm close to giving up and burning a 64KB page, 
which requires only ALIGN_DOWN() in the kernel.

-- 
Hollis Blanchard
IBM Linux Technology Center
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-13 Thread Hollis Blanchard
On Thu, 2008-11-13 at 07:44 +1100, Benjamin Herrenschmidt wrote:
 
 Again, why can't we just stick something in the kernel code that
 reserves the last page ? It could be in prom.c or it could be called by
 affected 4xx platforms by the platform code, whatever, but the reserve
 map isn't really meant for that and will not be passed over from kernel
 to kernel by kexec.

Reserving a page is overkill; only the last 256 bytes are affected. We
need to intercept at the LMB level, because allocations are already done
there, so by the time we hit bootmem it's way too late.

I simply don't see a good place to do this in the kernel. It would have
to be before the first lmb_alloc() call, which for safety would put it
inside early_init_devtree() -- along with the other lmb_reserve()
calls.[1]

However, ppc_md.probe() hasn't even been called yet, so there's no way
of knowing if we're on an affected system, unless you want to add a
special of_scan_flat_dt() call here.

I'm open to suggestions, but I don't see a better way than what I
already sent. I think the important part is to call lmb_add() for all
memory, but lmb_reserve() the last 256 bytes before lmb_alloc() happens.

It sounds like kexec must have some knowledge of the platform and device
tree already, so is this really a big deal? At any rate, this
conversation is somewhat academic, since there is no kexec on 44x... so
maybe this can be re-addressed when that becomes a real issue.

[1] This is exactly where flat device tree reservations are done, and
that's why the patch I submitted works.

-- 
Hollis Blanchard
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-12 Thread Benjamin Herrenschmidt
On Wed, 2008-11-12 at 06:31 -0500, Josh Boyer wrote:
 On Wed, 12 Nov 2008 15:37:43 +1100
 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
 
  On Tue, 2008-11-11 at 18:06 -0600, Hollis Blanchard wrote:
   The current CHIP11 errata truncates the device tree memory node, and 
   subtracts
   (hardcoded) 4096 bytes. This breaks kernels with larger PAGE_SIZE, since 
   the
   bootmem allocator assumes that total memory is a multiple of PAGE_SIZE.
   
   Instead, use a device tree memory reservation to reserve only the 256 
   bytes
   actually affected by the errata, leaving the total memory size unaltered.
   
   Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
  
  While I prefer this approach, won't it break kexec ?
 
 Break it how?  Particularly given that kexec doesn't work on 4xx (yet).

Allright, wrong wording. It will make kexec more painful since it will
have to also create that reserved area in the target DT.

 I don't think that's it.  I think it's more that we're opportunistic and
 the wrapper is the easiest place to do this, given that U-Boot itself
 will be doing the reserve for platforms that don't require the
 wrapper.
 
 So we could do the fixup in-kernel, but how do you do that
 deterministically given that U-Boot might have already done it?

Bah, do you know many RAM chip that will chop off the last 4K ?

I still find it a bit tricky to have memory nodes not aligned on nice
fat big boundaries tho.

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-12 Thread Hollis Blanchard
On Wed, 2008-11-12 at 22:52 +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2008-11-12 at 06:31 -0500, Josh Boyer wrote:
  On Wed, 12 Nov 2008 15:37:43 +1100
  Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
  
   On Tue, 2008-11-11 at 18:06 -0600, Hollis Blanchard wrote:
The current CHIP11 errata truncates the device tree memory node, and 
subtracts
(hardcoded) 4096 bytes. This breaks kernels with larger PAGE_SIZE, 
since the
bootmem allocator assumes that total memory is a multiple of PAGE_SIZE.

Instead, use a device tree memory reservation to reserve only the 256 
bytes
actually affected by the errata, leaving the total memory size 
unaltered.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
   
   While I prefer this approach, won't it break kexec ?
  
  Break it how?  Particularly given that kexec doesn't work on 4xx (yet).
 
 Allright, wrong wording. It will make kexec more painful since it will
 have to also create that reserved area in the target DT.
 
  I don't think that's it.  I think it's more that we're opportunistic and
  the wrapper is the easiest place to do this, given that U-Boot itself
  will be doing the reserve for platforms that don't require the
  wrapper.
  
  So we could do the fixup in-kernel, but how do you do that
  deterministically given that U-Boot might have already done it?
 
 Bah, do you know many RAM chip that will chop off the last 4K ?

Forget pages. The errata is about the last 256 bytes of physical memory.

 I still find it a bit tricky to have memory nodes not aligned on nice
 fat big boundaries tho.

I don't know what you're referring to. The patch I sent doesn't touch
memory nodes, so they are indeed still aligned on nice fat big
boundaries.

I don't think this is overengineering at all. We can't touch the last
256 bytes, so we mark it reserved, and then we won't. Altering memory
nodes is far more complicated and error-prone.

-- 
Hollis Blanchard
IBM Linux Technology Center

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-12 Thread Benjamin Herrenschmidt
On Wed, 2008-11-12 at 09:11 -0600, Hollis Blanchard wrote:
 Forget pages. The errata is about the last 256 bytes of physical
 memory.
 
  I still find it a bit tricky to have memory nodes not aligned on
 nice
  fat big boundaries tho.
 
 I don't know what you're referring to. The patch I sent doesn't touch
 memory nodes, so they are indeed still aligned on nice fat big
 boundaries.

My last comment was about the approach of modifying the memory node.

 I don't think this is overengineering at all. We can't touch the last
 256 bytes, so we mark it reserved, and then we won't. Altering memory
 nodes is far more complicated and error-prone.

But your approach is going to be painful for kexec which will have to
duplicate that logic.

Again, why can't we just stick something in the kernel code that
reserves the last page ? It could be in prom.c or it could be called by
affected 4xx platforms by the platform code, whatever, but the reserve
map isn't really meant for that and will not be passed over from kernel
to kernel by kexec.

Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


[PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-11 Thread Hollis Blanchard
The current CHIP11 errata truncates the device tree memory node, and subtracts
(hardcoded) 4096 bytes. This breaks kernels with larger PAGE_SIZE, since the
bootmem allocator assumes that total memory is a multiple of PAGE_SIZE.

Instead, use a device tree memory reservation to reserve only the 256 bytes
actually affected by the errata, leaving the total memory size unaltered.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]

---

Changes from v2:
- David pointed out I'd duplicated the fdt_add_mem_rsv() prototype, and that
  4xx.c should directly include libfdt/libfdt.h instead.

Using large pages results in a huge performance improvement for KVM, and this
patch is required to make Ilya's large page patch work. David and/or Josh,
please apply.

diff --git a/arch/powerpc/boot/4xx.c b/arch/powerpc/boot/4xx.c
--- a/arch/powerpc/boot/4xx.c
+++ b/arch/powerpc/boot/4xx.c
@@ -20,8 +20,9 @@
 #include ops.h
 #include reg.h
 #include dcr.h
+#include libfdt/libfdt.h
 
-static unsigned long chip_11_errata(unsigned long memsize)
+static void chip_11_errata(unsigned long memsize)
 {
unsigned long pvr;
 
@@ -31,13 +32,11 @@ static unsigned long chip_11_errata(unsi
case 0x4850:
case 0x48d0:
case 0x28d0:
-   memsize -= 4096;
+   fdt_add_mem_rsv(fdt, memsize - 256, 256);
break;
default:
break;
}
-
-   return memsize;
 }
 
 /* Read the 4xx SDRAM controller to get size of system memory. */
@@ -53,7 +52,7 @@ void ibm4xx_sdram_fixup_memsize(void)
memsize += SDRAM_CONFIG_BANK_SIZE(bank_config);
}
 
-   memsize = chip_11_errata(memsize);
+   chip_11_errata(memsize);
dt_fixup_memory(0, memsize);
 }
 
@@ -219,7 +218,7 @@ void ibm4xx_denali_fixup_memsize(void)
bank = 4; /* 4 banks */
 
memsize = cs * (1  (col+row)) * bank * dpath;
-   memsize = chip_11_errata(memsize);
+   chip_11_errata(memsize);
dt_fixup_memory(0, memsize);
 }
 
diff --git a/arch/powerpc/boot/libfdt-wrapper.c 
b/arch/powerpc/boot/libfdt-wrapper.c
--- a/arch/powerpc/boot/libfdt-wrapper.c
+++ b/arch/powerpc/boot/libfdt-wrapper.c
@@ -51,7 +51,7 @@
 #define devp_offset_find(devp) (((int)(devp))-1)
 #define devp_offset(devp)  (devp ? ((int)(devp))-1 : 0)
 
-static void *fdt;
+void *fdt;
 static void *buf; /* = NULL */
 
 #define EXPAND_GRANULARITY 1024
diff --git a/arch/powerpc/boot/ops.h b/arch/powerpc/boot/ops.h
--- a/arch/powerpc/boot/ops.h
+++ b/arch/powerpc/boot/ops.h
@@ -14,6 +14,7 @@
 #include stddef.h
 #include types.h
 #include string.h
+#include libfdt_env.h
 
 #defineCOMMAND_LINE_SIZE   512
 #defineMAX_PATH_LEN256
@@ -32,6 +33,9 @@ struct platform_ops {
void *  (*vmlinux_alloc)(unsigned long size);
 };
 extern struct platform_ops platform_ops;
+
+/* The device tree itself. Should almost always be accessed via dt_ops. */
+extern void *fdt;
 
 /* Device Tree operations */
 struct dt_ops {
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-11 Thread David Gibson
On Tue, Nov 11, 2008 at 06:06:46PM -0600, Hollis Blanchard wrote:
 The current CHIP11 errata truncates the device tree memory node, and subtracts
 (hardcoded) 4096 bytes. This breaks kernels with larger PAGE_SIZE, since the
 bootmem allocator assumes that total memory is a multiple of PAGE_SIZE.
 
 Instead, use a device tree memory reservation to reserve only the 256 bytes
 actually affected by the errata, leaving the total memory size unaltered.
 
 Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]

libfdt usage changes look fine to me.

Acked-by: David Gibson [EMAIL PROTECTED]

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] [v3] powerpc/4xx: work around CHIP11 errata in a more PAGE_SIZE-friendly way

2008-11-11 Thread Benjamin Herrenschmidt
On Tue, 2008-11-11 at 18:06 -0600, Hollis Blanchard wrote:
 The current CHIP11 errata truncates the device tree memory node, and subtracts
 (hardcoded) 4096 bytes. This breaks kernels with larger PAGE_SIZE, since the
 bootmem allocator assumes that total memory is a multiple of PAGE_SIZE.
 
 Instead, use a device tree memory reservation to reserve only the 256 bytes
 actually affected by the errata, leaving the total memory size unaltered.
 
 Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]

While I prefer this approach, won't it break kexec ?

I don't understand why we don't just have a bit of code in the kernel
itself that reserve that page on 44x at boot time and be done with it.

It's like we are trying to be too smart and over-engineer the solution.

Cheers,
Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev