RE: Disabling L1 D-cache and side effects

2008-09-30 Thread Benjamin Herrenschmidt
On Mon, 2008-09-29 at 14:38 -0700, Tirumala Reddy Marri wrote:
 Could you please point me to the code which does the Critical error (Machine
 Check) recovery? BTW I can boot Linux successfully up to the point where the
 rootfs is mounted. The mount fails, saying that blocks are corrupted in the
 file system. I had to modify lots of initial bring-up code to disable the
 D-cache and make sure all TLBs are cache inhibited. I also made sure none
 of misc_32.S, entry_32.S and head.S makes any references to the d-cache.

Why the heck are you doing that, btw? AFAIK, as Olof says, things like
atomic operations will not work, and neither will dcbz, etc. Even if you
manage to plaster over all of this in the kernel, whatever userspace code
you try to run will likely blow up too...

Cheers,
Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: New dma-noncoherent code, looking for comment and people to test

2008-09-30 Thread Benjamin Herrenschmidt
On Mon, 2008-09-29 at 10:26 -0700, Remi Machet wrote:
 
 I also removed the HIGHMEM support in dma_sync since memory allocated for
 DMA transfer should always be in ZONE_DMA (ie not in ZONE_HIGHMEM).

While I like the idea of simplifying that stuff, the above sentence is
incorrect unfortunately.

ZONE_DMA is an artifact of x86 ISA DMA limitations. You -will- get
requests to map pages for DMA that have been allocated within
different zones (notably highmem).

The problem with highmem is that whether or not you can DMA to/from
it is somewhat unclear: drivers set flags individually in various
layers to allow it, which is definitely not the right place to do so. So
while it would be nice to think we never will, in practice, we do.

 Looking forward to any comments about why this code may not work or is not
 as good as the original. If you do test this code on your platform, let me
 know how it goes ... if no one objects and no bug is found I will submit
 this patch in a month or so.

Ben.


Re: New dma-noncoherent code, looking for comment and people to test

2008-09-30 Thread Benjamin Herrenschmidt
On Mon, 2008-09-29 at 13:03 -0500, Kumar Gala wrote:
 
 We really should change this code over to the new dma changes Becky's  
 introduced so we just have a non-coherent set of DMA ops (thus we can  
 do both non-coherent and coherent in the same system).

Yes :-)

Ben.



Re: Please pull from 'dma' branch

2008-09-30 Thread Benjamin Herrenschmidt

  Becky Bruce (5):
   powerpc: Rename dma_64.c to dma.c
   powerpc: Move iommu dma ops from dma.c to dma-iommu.c
   powerpc: Drop archdata numa_node
   powerpc: Merge 32 and 64-bit dma code
   powerpc: Make dma_addr_t a u64 if CONFIG_PHYS_64BIT is set
 
 Paul,
 
 poke..

Paul's on leave this week, I'll pick up your stuff.

Cheers,
Ben.



Re: [RFC] powerpc/boot: add kernel,end node to the cuboot target

2008-09-30 Thread Milton Miller


On Sep 29, 2008, at 3:04 PM, Sebastian Siewior wrote:


* Milton Miller | 2008-09-23 20:24:02 [-0500]:


If you have any questions about kdump or what needs to happen,
please feel free to contact me


I copied most of the 64bit code to parse the device tree without the pci
nodes and moved it to 32. The userland *could* work, I'm not sure. My
output is:

|load: entry = 0x80053c flags = 0
|nr_segments = 2
|segment[0].buf   = 0x1002b8f0
|segment[0].bufsz = 80
|segment[0].mem   = (nil)
|segment[0].memsz = 1000
|segment[1].buf   = 0x4803f008
|segment[1].bufsz = 3a3138
|segment[1].mem   = 0x80
|segment[1].memsz = 3b


I would expect a third segment (kernel/zImage, dtb, and purgatory), but
it's not clear that you are getting that far yet.


Now. The entry address in image-start is valid and is the entrypoint of
the custom cuImage. Custom means that it does not depend on any register
values passed from u-boot (the original one needs a pointer to bd_t).
The only requirement is a valid 1:1 memory mapping.


ok sounds good.  does this have the dtb in it too?


I learned that I cannot disable the MMU on Book-E, so I have to create
a new temporary mapping in my relocate_new_kernel routine. _start is
doing the same thing I am trying to accomplish: create a new mapping
without killing the current one, then switch over. This is done by
disabling all mappings but the current, creating a new mapping with
EPN/RPN = 0 and swapping the TS bit in MAS1. This is my current patch
which is not really working:


I have never actually written or debugged any book-E code, and this
deals directly with that.  However, a quick read of ePAPR chapter 5
suggests that rather than just using the other TS, you want to actively
decide that control will transfer to TS0, and establish the
mappings there.  Again, I'm not familiar with the book-E code, but the
kernel itself will use the upper 1/4 of the effective address space by
default, so the low offset will likely be available.
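
For readers following the rotate-and-mask instructions in the quoted patch
that follows, here is a minimal C model of rlwinm and rlwimi. It is only a
sketch: the helper names are invented for illustration, and the bit positions
mentioned in the note after the code (MSR[IS] at value 0x20, MAS1 Valid and
IPROT as the two most significant bits) are assumptions based on the usual
Book-E register layout, not something stated in this thread.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n taken modulo 32). */
uint32_t ppc_rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Build the mask MB..ME in IBM bit numbering (bit 0 is the MSB). */
uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;        /* bits mb..31 */
    uint32_t upto_me = 0xFFFFFFFFu << (31 - me); /* bits 0..me  */

    /* mb > me means the mask wraps around the word. */
    return (mb <= me) ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* rlwinm rA,rS,SH,MB,ME: rotate left, then AND with the mask. */
uint32_t ppc_rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return ppc_rotl32(rs, sh) & ppc_mask(mb, me);
}

/* rlwimi rA,rS,SH,MB,ME: rotate left, then insert under the mask into rA. */
uint32_t ppc_rlwimi(uint32_t ra, uint32_t rs, unsigned sh,
                    unsigned mb, unsigned me)
{
    uint32_t m = ppc_mask(mb, me);

    return (ppc_rotl32(rs, sh) & m) | (ra & ~m);
}
```

Under those assumptions, ppc_rlwinm(msr, 27, 31, 31) yields MSR[IS] as 0 or 1,
ppc_rlwimi(0x10000000, esel, 16, 4, 15) builds a MAS0 value with TLBSEL = 1
and the entry select taken from esel, and ppc_rlwinm(mas1, 0, 2, 31) clears
the two most significant bits of MAS1.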



diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 7a6dfbc..49c9c2a 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -878,22 +878,142 @@ relocate_new_kernel:
/* r4 = reboot_code_buffer */
/* r5 = start_address  */

-   li  r0, 0
+   mr r27, r4
+   mr r28, r5
+   mr r29, r6
+
+   li  r25,0   /* phys kernel start (low) */
+   li  r24,0   /* CPU number */
+   li  r23,0   /* phys kernel start (high) */
+
+
+/* 1. Find the index of the entry we're executing in */
+   bl  invstr  /* Find our address */
+invstr: mflr    r6  /* Make it accessible */
+   mfmsr   r7
+   rlwinm  r4,r7,27,31,31  /* extract MSR[IS] */
+   mfspr   r7, SPRN_PID0
+   slwi    r7,r7,16
+   or  r7,r7,r4
+   mtspr   SPRN_MAS6,r7
+   tlbsx   0,r6    /* search MSR[IS], SPID=PID0 */
+#ifndef CONFIG_E200
+   mfspr   r7,SPRN_MAS1
+   andis.  r7,r7,[EMAIL PROTECTED]
+   bne match_TLB

The branch above is taken, so I've found my current mapping


ok, but should you not be using PID0 explicitly to say global only?



+   mfspr   r7,SPRN_PID1
+   slwi    r7,r7,16
+   or  r7,r7,r4
+   mtspr   SPRN_MAS6,r7
+   tlbsx   0,r6    /* search MSR[IS], SPID=PID1 */
+   mfspr   r7,SPRN_MAS1
+   andis.  r7,r7,[EMAIL PROTECTED]
+   bne match_TLB
+   mfspr   r7, SPRN_PID2
+   slwi    r7,r7,16
+   or  r7,r7,r4
+   mtspr   SPRN_MAS6,r7
+   tlbsx   0,r6    /* Fall through, we had to match */
+#endif
+match_TLB:
+
+   rlwinm  r3,r7,16,20,31  /* Extract MAS0(Entry) */
+
+   mfspr   r7,SPRN_MAS1    /* Insure IPROT set */
+   oris    r7,r7,[EMAIL PROTECTED]
+   mtspr   SPRN_MAS1,r7
+   tlbwe
+
+/* 2. Invalidate all entries except the entry we're executing in */
+   mfspr   r9,SPRN_TLB1CFG
+   andi.   r9,r9,0xfff
+   li  r6,0    /* Set Entry counter to 0 */
+1:  lis r7,0x1000   /* Set MAS0(TLBSEL) = 1 */
+   rlwimi  r7,r6,16,4,15   /* Setup MAS0 = TLBSEL | ESEL(r6) */
+   mtspr   SPRN_MAS0,r7
+   tlbre
+   mfspr   r7,SPRN_MAS1
+   rlwinm  r7,r7,0,2,31    /* Clear MAS1 Valid and IPROT */
+   cmpw    r3,r6
+   beq skpinv  /* Dont update the current execution TLB */
+   mtspr   SPRN_MAS1,r7
+   tlbwe
+   isync
+skpinv: addi    r6,r6,1 /* Increment */
+   cmpw    r6,r9   /* Are we done? */
+   bne 1b  /* If not, repeat */

-   /*
-* Set Machine Status Register to a known status,
-* switch the MMU off and jump to 1: 

Re: [PATCH] sputrace : use marker_synchronize_unregister()

2008-09-30 Thread Ingo Molnar

* Jeremy Kerr [EMAIL PROTECTED] wrote:

 Mathieu,
 
  We need a marker_synchronize_unregister() before the end of exit() to
  make sure every probe caller has exited the non-preemptible section
  and thus is not executing the probe code anymore.
 
 Looks good - added to spufs.git.

that won't work very well as the patch relies on the new
marker_synchronize_unregister() facility.

Ingo


Re: How to prevent embedded ppc reset deadlock? (MPC83xx/85xx)

2008-09-30 Thread Leon Woestenberg
André,

On Tue, Sep 30, 2008 at 12:05 AM, Leon Woestenberg
[EMAIL PROTECTED] wrote:
 Since you also have to assert HRESET when you assert PORESET

 But when I assert PORESET, the processor will assert HRESET itself
 AFAIK, so why do this?

 you can wire-or them with a low drop schottky diode.


MPC8313E PowerQUICC II Pro Integrated Processor Family Reference
Manual, Rev. 1, section 4.2.2, page 4-6:

Directly after the negation of PORESET, the device starts the
configuration process. The device asserts HRESET throughout the
power-on reset process, including configuration

So, I will not drive HRESET myself but depend on the ppc to drive it for me.

Regards,
-- 
Leon

Re: [Cbe-oss-dev] [PATCH] sputrace : use marker_synchronize_unregister()

2008-09-30 Thread Jeremy Kerr
Ingo,

 that won't work very well as the patch relies on the new
 marker_synchronize_unregister() facility.

d'oh, right you are. Should I leave this in your hands to merge?

Cheers,


Jeremy

Re: [Cbe-oss-dev] [PATCH] sputrace : use marker_synchronize_unregister()

2008-09-30 Thread Ingo Molnar

* Jeremy Kerr [EMAIL PROTECTED] wrote:

 Ingo,
 
  that won't work very well as the patch relies on the new
  marker_synchronize_unregister() facility.
 
 d'oh, right you are. Should I leave this in your hands to merge?

would be nice if you could give your Acked-by for the sputrace bits,
then we can merge it. It's a one-liner so it shouldn't cause merging
trouble in linux-next.

Ingo

Re: [Cbe-oss-dev] [PATCH] sputrace : use marker_synchronize_unregister()

2008-09-30 Thread Jeremy Kerr
Ingo,

 would be nice if you could give your Acked-by for the sputrace bits,
 then we can merge it. It's a one-liner so it shouldn't cause merging
 trouble in linux-next.

Sure!

Acked-by: Jeremy Kerr [EMAIL PROTECTED]


Jeremy

Re: [PATCH 1/2] OF: add fsl,mcu-mpc8349emitx to the exception list

2008-09-30 Thread Anton Vorontsov
On Tue, Sep 23, 2008 at 06:12:19PM +0400, Anton Vorontsov wrote:
 of/base.c matches on the first (most specific) entries, which isn't
 quite practical but it was discussed[1] that this won't change.
 
 The bindings specify verbose information for the devices, but
 it doesn't fit in the I2C ID's 20-character limit. The limit won't
 change[2], and the bindings won't change either as they're correct.
 
 So we have to put an exception for the MPC8349E-mITX-compatible
 MCUs.
 
 [1] http://www.mail-archive.com/linuxppc-dev@ozlabs.org/msg21196.html
 [2] http://www.nabble.com/-PATCH-1-2--i2c:-expand-I2C's-id.name-to-23-characters-td19577063.html
 
 Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
 ---
  drivers/of/base.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

Any issues with this or the second patch? Can we merge them?


Thanks,

 diff --git a/drivers/of/base.c b/drivers/of/base.c
 index ad8ac1a..a726464 100644
 --- a/drivers/of/base.c
 +++ b/drivers/of/base.c
 @@ -410,7 +410,7 @@ struct of_modalias_table {
   char *modalias;
  };
  static struct of_modalias_table of_modalias_table[] = {
 - /* Empty for now; add entries as needed */
 + { "fsl,mcu-mpc8349emitx", "mcu-mpc8349emitx" },
  };
  
  /**
 -- 
 1.5.6.3
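
The effect of such an exception entry can be sketched in user-space C. This
is a simplified model, not the code in drivers/of/base.c: the function name,
the struct name and the 20-byte buffer size (taken here to include the
terminating NUL) are assumptions made for illustration, and unlike a kernel
helper that might truncate, this sketch simply fails when a name does not fit.

```c
#include <stddef.h>
#include <string.h>

#define I2C_NAME_LIMIT 20 /* assumed size of the id name buffer, NUL included */

struct modalias_exception {
    const char *of_device; /* compatible string to match */
    const char *modalias;  /* short name to use instead */
};

static const struct modalias_exception exceptions[] = {
    { "fsl,mcu-mpc8349emitx", "mcu-mpc8349emitx" },
};

/*
 * Pick a modalias for a node described by its list of compatible strings.
 * Returns 0 on success, -1 if no usable name fits into out[len].
 */
int pick_modalias(const char *const *compat, size_t ncompat,
                  char *out, size_t len)
{
    size_t i, j;
    const char *p;

    /* 1. An exception entry wins if any compatible string matches it. */
    for (i = 0; i < sizeof(exceptions) / sizeof(exceptions[0]); i++) {
        for (j = 0; j < ncompat; j++) {
            if (strcmp(compat[j], exceptions[i].of_device) != 0)
                continue;
            if (strlen(exceptions[i].modalias) >= len)
                return -1;
            strcpy(out, exceptions[i].modalias);
            return 0;
        }
    }

    /* 2. Otherwise use the first entry with the vendor prefix stripped. */
    if (ncompat == 0)
        return -1;
    p = strchr(compat[0], ',');
    if (p == NULL)
        return -1;
    p++;
    if (strlen(p) >= len)
        return -1; /* would not fit in the id name buffer */
    strcpy(out, p);
    return 0;
}
```

With a node whose first, most specific compatible string is too long for the
buffer, the exception entry is what yields a usable name; without it the
sketch reports failure for that node.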

-- 
Anton Vorontsov
email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2

[PATCH] Remove legacy kdump kernel build support

2008-09-30 Thread Mohan Kumar M
Remove legacy kdump kernel build support

This patch removes legacy kdump kernel build support (i.e. compiling a
kdump kernel at a fixed hardcoded address of 32MB). With relocatable
kernel support it is now possible to use the regular kernel binary for
capturing the dump as well. A relocatable kdump kernel does not require
trampoline code for exception handlers. The patch removes kdump.h, as most
of the macros defined there are not used any more. The relocatable kdump
kernel also allows us to load the kdump kernel anywhere, as specified by
the crashkernel parameter, instead of at the hardcoded 32MB address.

Signed-off-by: Mohan Kumar M [EMAIL PROTECTED]
---
 arch/powerpc/Kconfig|2 -
 arch/powerpc/include/asm/iommu.h|5 
 arch/powerpc/include/asm/kdump.h|   35 --
 arch/powerpc/include/asm/page.h |1 -
 arch/powerpc/kernel/crash.c |1 -
 arch/powerpc/kernel/crash_dump.c|   40 ---
 arch/powerpc/kernel/iommu.c |1 -
 arch/powerpc/kernel/machine_kexec.c |5 
 arch/powerpc/kernel/prom.c  |2 -
 arch/powerpc/kernel/setup_64.c  |3 --
 10 files changed, 5 insertions(+), 90 deletions(-)
 delete mode 100644 arch/powerpc/include/asm/kdump.h

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 17c988b..cad6035 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -824,11 +824,9 @@ config PAGE_OFFSET
default 0xc000
 config KERNEL_START
hex
-   default 0xc200 if CRASH_DUMP
default 0xc000
 config PHYSICAL_START
hex
-   default 0x0200 if CRASH_DUMP
default 0x
 endif
 
diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 51ecfef..1ee9b12 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -35,6 +35,11 @@
 #define IOMMU_PAGE_MASK   (~((1 << IOMMU_PAGE_SHIFT) - 1))
 #define IOMMU_PAGE_ALIGN(addr) _ALIGN_UP(addr, IOMMU_PAGE_SIZE)
 
+#ifdef CONFIG_CRASH_DUMP
+#define KDUMP_MIN_TCE_ENTRIES  2048
+extern unsigned long long __kdump_flag;
+#endif
+
 /* Boot time flags */
 extern int iommu_is_off;
 extern int iommu_force_on;
diff --git a/arch/powerpc/include/asm/kdump.h b/arch/powerpc/include/asm/kdump.h
deleted file mode 100644
index f6c93c7..0000000
--- a/arch/powerpc/include/asm/kdump.h
+++ /dev/null
@@ -1,35 +0,0 @@
-#ifndef _PPC64_KDUMP_H
-#define _PPC64_KDUMP_H
-
-/* Kdump kernel runs at 32 MB, change at your peril. */
-#define KDUMP_KERNELBASE   0x2000000
-
-/* How many bytes to reserve at zero for kdump. The reserve limit should
- * be greater or equal to the trampoline's end address.
- * Reserve to the end of the FWNMI area, see head_64.S */
-#define KDUMP_RESERVE_LIMIT    0x10000 /* 64K */
-
-#ifdef CONFIG_CRASH_DUMP
-
-#define KDUMP_TRAMPOLINE_START 0x0100
-#define KDUMP_TRAMPOLINE_END   0x3000
-
-#define KDUMP_MIN_TCE_ENTRIES  2048
-
-#endif /* CONFIG_CRASH_DUMP */
-
-#ifndef __ASSEMBLY__
-#ifdef CONFIG_CRASH_DUMP
-
-extern void reserve_kdump_trampoline(void);
-extern void setup_kdump_trampoline(void);
-
-#else /* !CONFIG_CRASH_DUMP */
-
-static inline void reserve_kdump_trampoline(void) { ; }
-static inline void setup_kdump_trampoline(void) { ; }
-
-#endif /* CONFIG_CRASH_DUMP */
-#endif /* __ASSEMBLY__ */
-
-#endif /* __PPC64_KDUMP_H */
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 64e1445..00dedf1 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -11,7 +11,6 @@
  */
 
 #include <asm/asm-compat.h>
-#include <asm/kdump.h>
 #include <asm/types.h>
 
 /*
diff --git a/arch/powerpc/kernel/crash.c b/arch/powerpc/kernel/crash.c
index 0a8439a..4a89934 100644
--- a/arch/powerpc/kernel/crash.c
+++ b/arch/powerpc/kernel/crash.c
@@ -29,7 +29,6 @@
 #include <asm/processor.h>
 #include <asm/machdep.h>
 #include <asm/kexec.h>
-#include <asm/kdump.h>
 #include <asm/prom.h>
 #include <asm/firmware.h>
 #include <asm/smp.h>
diff --git a/arch/powerpc/kernel/crash_dump.c b/arch/powerpc/kernel/crash_dump.c
index a323c9b..7d3134c 100644
--- a/arch/powerpc/kernel/crash_dump.c
+++ b/arch/powerpc/kernel/crash_dump.c
@@ -15,7 +15,6 @@
 #include <linux/bootmem.h>
 #include <linux/lmb.h>
 #include <asm/code-patching.h>
-#include <asm/kdump.h>
 #include <asm/prom.h>
 #include <asm/firmware.h>
 #include <asm/uaccess.h>
@@ -27,45 +26,6 @@
 #define DBG(fmt...)
 #endif
 
-void __init reserve_kdump_trampoline(void)
-{
-   lmb_reserve(0, KDUMP_RESERVE_LIMIT);
-}
-
-static void __init create_trampoline(unsigned long addr)
-{
-   unsigned int *p = (unsigned int *)addr;
-
-   /* The maximum range of a single instruction branch, is the current
-* instruction's address + (32 MB - 4) bytes. For the trampoline we
-* need to branch to current address + 32 MB. So we insert a nop at
-* the trampoline address, then the next instruction (+ 4 bytes)
-* does a branch 

[PATCH 2/2] Support for relocatable kdump kernel

2008-09-30 Thread Mohan Kumar M
Support for relocatable kdump kernel

This patch adds relocatable kernel support for kdump. With this one can
use the same regular kernel to capture the kdump. A signature (0xfeed1234)
is passed in r8 from panic code to the next kernel through kexec_sequence
and purgatory code. The signature is used to differentiate between
relocatable kdump kernel and non-kdump kernels.

The purgatory code compares the signature and sets the __kdump_flag in
head_64.S.  During the boot up, kernel code checks __kdump_flag and if it
is set, the kernel will behave as relocatable kdump kernel. This kernel
will boot at the address where it was loaded by kexec-tools ie at the
address reserved through crashkernel boot parameter.

Now for kdump, both CONFIG_RELOCATABLE and CONFIG_CRASH_DUMP should be
enabled and the same kernel can be used as production and kdump kernel.
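
The boot-time decision described above can be modelled in a few lines of C.
This is a sketch only: the function name is invented, the load offset stands
for the value the head_64.S hunk keeps in r26, and the virtual base constant
0xc000000000000000 is an assumption about the ppc64 configuration rather than
something spelled out in this mail.

```c
#include <stdint.h>

#define KERNEL_VIRT_BASE 0xc000000000000000ULL /* assumed ppc64 virtual base */

/*
 * Model of the check added to __after_prom_start: when __kdump_flag is 1
 * the relocations are processed for the address the kernel was loaded at,
 * otherwise for the default base.
 */
uint64_t relocation_target(uint64_t kdump_flag, uint64_t load_offset)
{
    uint64_t base = KERNEL_VIRT_BASE;

    if (kdump_flag == 1)
        base += load_offset;
    return base;
}
```

For example, with the flag set and a load offset of 32MB the sketch returns
the base plus 0x2000000; with the flag clear it returns the base unchanged.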

Signed-off-by: Mohan Kumar M [EMAIL PROTECTED]
---
 Documentation/kdump/kdump.txt  |   17 +++--
 arch/powerpc/include/asm/kexec.h   |6 +++
 arch/powerpc/kernel/head_64.S  |   60 +---
 arch/powerpc/kernel/iommu.c|2 +-
 arch/powerpc/kernel/machine_kexec_64.c |   12 --
 arch/powerpc/kernel/misc_64.S  |   10 --
 6 files changed, 90 insertions(+), 17 deletions(-)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index 0705040..95a9c95 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -104,12 +104,14 @@ Build the system and dump-capture kernels
 There are two possible methods of using Kdump.
 
 1) Build a separate custom dump-capture kernel for capturing the
-   kernel core dump.
+   kernel core dump. Legacy kdump kernel build support(i.e separate kdump
+   kernel) for ppc64 is not available.
 
 2) Or use the system kernel binary itself as dump-capture kernel and there is
no need to build a separate dump-capture kernel. This is possible
only with the architecutres which support a relocatable kernel. As
-   of today, i386, x86_64 and ia64 architectures support relocatable kernel.
+   of today, i386, x86_64, ppc64 and ia64 architectures support relocatable
+   kernel.
 
 Building a relocatable kernel is advantageous from the point of view that
 one does not have to build a second kernel for capturing the dump. But
@@ -207,8 +209,15 @@ Dump-capture kernel config options (Arch Dependent, i386 and x86_64)
 Dump-capture kernel config options (Arch Dependent, ppc64)
 --
 
-*  Make and install the kernel and its modules. DO NOT add this kernel
-   to the boot loader configuration files.
+1) Enable Build a kdump crash kernel support under Kernel options:
+
+   CONFIG_CRASH_DUMP=y
+
+2)   Enable Build a relocatable kernel support
+
+   CONFIG_RELOCATABLE=y
+
+   Make and install the kernel and its modules.
 
 Dump-capture kernel config options (Arch Dependent, ia64)
 --
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 3736d9b..765b318 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -24,6 +24,12 @@
 
 #define KEXEC_CONTROL_PAGE_SIZE 4096
 
+/*
+ * Used to differentiate between relocatable kdump kernel and other
+ * kernels
+ */
+#define KDUMP_SIGNATURE 0xfeed1234
+
 /* The native architecture */
 #ifdef __powerpc64__
 #define KEXEC_ARCH KEXEC_ARCH_PPC64
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 84856be..bbe1617 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -97,6 +97,14 @@ __secondary_hold_spinloop:
 __secondary_hold_acknowledge:
.llong  0x0
 
+   /* This flag is set only for kdump kernels so that */
+   /* it will be relocatable. Purgatory code user space kexec-tools */
+   /* sets this flag. Do not move this variable as purgatory code */
+   /* relies on the position of this variables */
+   .globl  __kdump_flag
+__kdump_flag:
+   .llong  0x0
+
 #ifdef CONFIG_PPC_ISERIES
/*
 * At offset 0x20, there is a pointer to iSeries LPAR data.
@@ -1384,7 +1392,15 @@ _STATIC(__after_prom_start)
/* process relocations for the final address of the kernel */
lis r25,[EMAIL PROTECTED]   /* compute virtual base of kernel */
sldi    r25,r25,32
-   mr  r3,r25
+#ifdef CONFIG_CRASH_DUMP
+   ld  r7,[EMAIL PROTECTED](r2)
+   add r7,r7,r26
+   ld  r7,0(r7)
+   cmpldi  cr0,r7,1    /* relocatable kernel ? */
+   bne 1f
+   add r25,r25,r26
+#endif
+1: mr  r3,r25
bl  .relocate
 #endif
 
@@ -1398,10 +1414,26 @@ _STATIC(__after_prom_start)
li  r3,0/* target addr */
mr. r4,r26  /* In some cases the loader may  */
beq 9f  /* have already put us at 

[PATCH] Relocatable kdump kernel support in kexec-tools

2008-09-30 Thread Mohan Kumar M
Relocatable kdump kernel support in kexec-tools

This patch adds relocatable kernel support for kdump in the kexec-tools
code. A signature (0xfeed1234) is passed in r6 from panic code to the
purgatory code through kexec_sequence function. The signature is used to
differentiate between relocatable kdump kernel and non-kdump kernels.

The purgatory code compares the signature and sets the __kdump_flag in
head_64.S by using the offset with respect to next kernel load address.
During the boot up, kernel code checks __kdump_flag and if it is set, the
kernel will behave as relocatable kdump kernel.

Signed-off-by: Mohan Kumar M [EMAIL PROTECTED]
---
diff --git a/purgatory/arch/ppc64/v2wrap.S b/purgatory/arch/ppc64/v2wrap.S
index b3563de..f69dad2 100644
--- a/purgatory/arch/ppc64/v2wrap.S
+++ b/purgatory/arch/ppc64/v2wrap.S
@@ -45,6 +45,7 @@
oris    rn,rn,[EMAIL PROTECTED]; \
ori rn,rn,[EMAIL PROTECTED]
 
+#define KDUMP_SIGNATURE 0xfeed1234
 
.machine ppc64
.globl purgatory_start
@@ -64,6 +65,7 @@ master:
isync
mr  17,3    # save cpu id to r17
mr  15,4    # save physical address in reg15
+   mr  18,6    # save kdump flag in reg18
 
LOADADDR(6,my_toc)
ld  2,0(6)  #setup toc
@@ -94,6 +96,12 @@ master:
mtctr   4   # prepare branch too
mr  3,16# restore dt address
 
+   LOADADDR(6,KDUMP_SIGNATURE)
+   cmpd    18,6
+   bne regular
+   li  7,1
+   std 7,24(4) # mark kdump flag at kernel
+regular:
lwz 7,0(4)  # get the first instruction that we stole
stw 7,0(0)  # and put it in the slave loop at 0
# skip cache flush, do we care?
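
The purgatory check in the hunk above can be pictured with a small C model.
It is a sketch under stated assumptions: the function names are invented, the
kernel image is represented by a plain byte buffer, and the value is stored
in host byte order rather than with a ppc64 std instruction. The offset 24
comes from the "std 7,24(4)" line in the patch.

```c
#include <stdint.h>
#include <string.h>

#define KDUMP_SIGNATURE   0xfeed1234ULL
#define KDUMP_FLAG_OFFSET 24 /* byte offset used by "std 7,24(4)" */

/*
 * If the value handed over by the crashing kernel equals the signature,
 * store 1 into the 64-bit flag slot at the fixed offset from the start
 * of the next kernel image. Returns 1 if the flag was written, else 0.
 */
int mark_kdump_flag(unsigned char *kernel_image, uint64_t handed_over)
{
    uint64_t one = 1;

    if (handed_over != KDUMP_SIGNATURE)
        return 0;
    memcpy(kernel_image + KDUMP_FLAG_OFFSET, &one, sizeof(one));
    return 1;
}

/* Read the flag back from the same slot, as the booting kernel would. */
uint64_t read_kdump_flag(const unsigned char *kernel_image)
{
    uint64_t flag;

    memcpy(&flag, kernel_image + KDUMP_FLAG_OFFSET, sizeof(flag));
    return flag;
}
```

Because the flag lives at a fixed offset from the start of the image, the
model also shows why the flag's position in the kernel must not change
without updating the purgatory code.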

[PATCH] powerpc: GE Fanuc's FPGA based PIC controller on the SBC610

2008-09-30 Thread Martyn Welch
Support for the SBC610 VPX Single Board Computer from GE Fanuc (PowerPC 
MPC8641D).

A number of MPC8641D-based boards route on-board interrupts through
an FPGA-based interrupt controller, which is chained with the
MPC8641D's mpic. This patch provides a basic driver to allow basic routing
of interrupts to the mpic.

Signed-off-by: Martyn Welch [EMAIL PROTECTED]
---

 arch/powerpc/boot/dts/gef_sbc610.dts |   38 
 arch/powerpc/platforms/86xx/gef_sbc610.c |   25 +++
 arch/powerpc/sysdev/Makefile |2 
 arch/powerpc/sysdev/gef_pic.c|  258 ++
 arch/powerpc/sysdev/gef_pic.h|   11 +
 5 files changed, 329 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/boot/dts/gef_sbc610.dts b/arch/powerpc/boot/dts/gef_sbc610.dts
index 80b79e4..d7a591b 100644
--- a/arch/powerpc/boot/dts/gef_sbc610.dts
+++ b/arch/powerpc/boot/dts/gef_sbc610.dts
@@ -67,6 +67,36 @@
reg = 0x0 0x4000; // set by uboot
};
 
+   [EMAIL PROTECTED] {
+   #address-cells = 2;
+   #size-cells = 1;
+   compatible = fsl,mpc8641-localbus, simple-bus;
+   reg = 0xf8005000 0x1000;
+   interrupts = 19 2;
+   interrupt-parent = mpic;
+
+   ranges = 0 0 0xff00 0x0100 // 16MB Boot flash
+ 1 0 0xe800 0x0800 // Paged Flash 0
+ 2 0 0xe000 0x0800 // Paged Flash 1
+ 3 0 0xfc10 0x0002 // NVRAM
+ 4 0 0xfc00 0x8000 // FPGA
+ 5 0 0xfc008000 0x8000 // AFIX FPGA
+ 6 0 0xfd00 0x0080 // IO FPGA (8-bit)
+ 7 0 0xfd80 0x0080;   // IO FPGA (32-bit)
+
+   gef_pic: [EMAIL PROTECTED],4000 {
+   #interrupt-cells = 2;
+   interrupt-controller;
+   device_type = interrupt-controller;
+   compatible = gef,fpga-pic;
+   reg = 0x4 0x4000 0x20;
+   interrupts = 0x8 0x1
+ 0x9 0x1;
+   interrupt-parent = mpic;
+
+   };
+   };
+
[EMAIL PROTECTED] {
#address-cells = 1;
#size-cells = 1;
@@ -150,13 +180,13 @@
reg = 0x24520 0x20;
 
phy0: [EMAIL PROTECTED] {
-   interrupt-parent = mpic;
-   interrupts = 0x0 0x1;
+   interrupt-parent = gef_pic;
+   interrupts = 0x9 0x4;
reg = 1;
};
phy2: [EMAIL PROTECTED] {
-   interrupt-parent = mpic;
-   interrupts = 0x0 0x1;
+   interrupt-parent = gef_pic;
+   interrupts = 0x8 0x4;
reg = 3;
};
};
diff --git a/arch/powerpc/platforms/86xx/gef_sbc610.c b/arch/powerpc/platforms/86xx/gef_sbc610.c
index 605678c..8b8bbb5 100644
--- a/arch/powerpc/platforms/86xx/gef_sbc610.c
+++ b/arch/powerpc/platforms/86xx/gef_sbc610.c
@@ -37,6 +37,7 @@
 
 #include <sysdev/fsl_pci.h>
 #include <sysdev/fsl_soc.h>
+#include <sysdev/gef_pic.h>
 
 #include "mpc86xx.h"
 
@@ -48,6 +49,28 @@
 #define DBG (fmt...) do { } while (0)
 #endif
 
+void __iomem *sbc610_regs;
+
+static void __init gef_sbc610_init_irq(void)
+{
+   struct device_node *cascade_node = NULL;
+
+   mpc86xx_init_irq();
+
+   /*
+* There is a simple interrupt handler in the main FPGA, this needs
+* to be cascaded into the MPIC
+*/
+   cascade_node = of_find_compatible_node(NULL, NULL, "gef,fpga-pic");
+   if (!cascade_node) {
+   printk(KERN_WARNING "SBC610: No FPGA PIC\n");
+   return;
+   }
+
+   gef_pic_init(cascade_node);
+   of_node_put(cascade_node);
+}
+
 static void __init gef_sbc610_setup_arch(void)
 {
 #ifdef CONFIG_PCI
@@ -153,7 +176,7 @@ define_machine(gef_sbc610) {
.name   = "GE Fanuc SBC610",
.probe  = gef_sbc610_probe,
.setup_arch = gef_sbc610_setup_arch,
-   .init_IRQ   = mpc86xx_init_irq,
+   .init_IRQ   = gef_sbc610_init_irq,
.show_cpuinfo   = gef_sbc610_show_cpuinfo,
.get_irq= mpic_get_irq,
.restart= fsl_rstcr_restart,
diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile
index b6c269e..ef96f71 100644
--- a/arch/powerpc/sysdev/Makefile
+++ b/arch/powerpc/sysdev/Makefile
@@ -50,3 +50,5 @@ endif
 ifeq ($(CONFIG_SUSPEND),y)
 obj-$(CONFIG_6xx)  += 

Re: [PATCH] powerpc: GE Fanuc's FPGA based PIC controller on the SBC610

2008-09-30 Thread Kumar Gala


On Sep 30, 2008, at 9:29 AM, Martyn Welch wrote:



arch/powerpc/boot/dts/gef_sbc610.dts |   38
arch/powerpc/platforms/86xx/gef_sbc610.c |   25 +++
arch/powerpc/sysdev/Makefile |2
arch/powerpc/sysdev/gef_pic.c|  258 ++
arch/powerpc/sysdev/gef_pic.h|   11 +


The gef_pic should really live in platforms/86xx/ since it's specific
to your board.


- k

[PATCH] properly reserve in bootmem the lmb reserved regions that cross numa nodes

2008-09-30 Thread Jon Tollefson
If there are multiple reserved memory blocks via lmb_reserve() that are
contiguous in address and on different NUMA nodes, we lose track of which
address ranges to reserve in bootmem on which node.  I discovered this
when I only recently got to try 16GB huge pages on a system with more
than 2 nodes.

When scanning the device tree in early boot we call lmb_reserve() with
the addresses of the 16G pages that we find so that the memory doesn't
get used for something else.  For example the addresses for the pages
could be 40, 44, 48, 4C, etc - 8 pages,
one on each of eight nodes.  In the lmb after all the pages have been
reserved it will look something like the following:

lmb_dump_all:
memory.cnt= 0x2
memory.size   = 0x3e8000
memory.region[0x0].base   = 0x0
  .size = 0x1e8000
memory.region[0x1].base   = 0x40
  .size = 0x20
reserved.cnt  = 0x5
reserved.size = 0x3e8000
reserved.region[0x0].base   = 0x0
  .size = 0x7b5000
reserved.region[0x1].base   = 0x2a0
  .size = 0x78c000
reserved.region[0x2].base   = 0x328c000
  .size = 0x43000
reserved.region[0x3].base   = 0xf4e8000
  .size = 0xb18000
reserved.region[0x4].base   = 0x40
  .size = 0x20


The reserved.region[0x4] contains the 16G pages.  In
arch/powerpc/mm/numa.c: do_init_bootmem() we loop through each of the
node numbers looking for the reserved regions that belong to the
particular node.  It is not able to identify region 0x4 as being a part
of each of the 8 nodes.  It assumes that a reserved region is only
on a single node.

This patch takes out the reserved region loop from inside
the loop that goes over each node.  It looks up the active region containing
the start of the reserved region.  If it extends past that active region then
it adjusts the size and gets the next active region containing it.
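
The walk described above can be sketched as stand-alone C. This is an
illustration, not the patch itself: the struct and function names are
hypothetical stand-ins for the kernel's active-region bookkeeping, ranges are
expressed as half-open page frame intervals, and the pieces are collected in
an array instead of being passed to a bootmem reservation call.

```c
#include <stddef.h>

struct active_region {      /* hypothetical stand-in for an active region */
    unsigned long start_pfn;
    unsigned long end_pfn;  /* exclusive */
    int nid;
};

struct reserved_piece {
    unsigned long start_pfn;
    unsigned long end_pfn;  /* exclusive */
    int nid;
};

/* Return the active region containing pfn, or NULL if there is none. */
const struct active_region *find_region(const struct active_region *ar,
                                        size_t n, unsigned long pfn)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (pfn >= ar[i].start_pfn && pfn < ar[i].end_pfn)
            return &ar[i];
    return NULL;
}

/*
 * Split the reserved range [start_pfn, end_pfn) along active-region
 * boundaries so that each piece can be reserved on its own node.
 * Returns the number of pieces written to out (at most max_out).
 */
size_t split_reserved(const struct active_region *ar, size_t n,
                      unsigned long start_pfn, unsigned long end_pfn,
                      struct reserved_piece *out, size_t max_out)
{
    size_t count = 0;
    const struct active_region *r = find_region(ar, n, start_pfn);

    while (start_pfn < end_pfn && r != NULL && count < max_out) {
        unsigned long piece_end = end_pfn;

        /* Trim the piece if the reservation runs past this region. */
        if (piece_end > r->end_pfn)
            piece_end = r->end_pfn;

        out[count].start_pfn = start_pfn;
        out[count].end_pfn = piece_end;
        out[count].nid = r->nid;
        count++;

        /* Continue with whatever is left, in the next region. */
        start_pfn = piece_end;
        if (start_pfn < end_pfn)
            r = find_region(ar, n, start_pfn);
    }
    return count;
}
```

A reservation that spans three adjacent regions on three nodes comes back as
three pieces, one per node, which is the behaviour the single-node assumption
in the old loop could not provide.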


Signed-off-by: Jon Tollefson [EMAIL PROTECTED]
---


 arch/powerpc/mm/numa.c |   63 -
 include/linux/mm.h |2 +
 mm/page_alloc.c|   19 ++
 3 files changed, 57 insertions(+), 27 deletions(-)


diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index d9a1813..07b8726 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -837,36 +837,45 @@ void __init do_init_bootmem(void)
  start_pfn, end_pfn);

free_bootmem_with_active_regions(nid, end_pfn);
+   }

-   /* Mark reserved regions on this node */
-   for (i = 0; i < lmb.reserved.cnt; i++) {
-   unsigned long physbase = lmb.reserved.region[i].base;
-   unsigned long size = lmb.reserved.region[i].size;
-   unsigned long start_paddr = start_pfn << PAGE_SHIFT;
-   unsigned long end_paddr = end_pfn << PAGE_SHIFT;
-
-   if (early_pfn_to_nid(physbase >> PAGE_SHIFT) != nid &&
-   early_pfn_to_nid((physbase+size-1) >> PAGE_SHIFT) != nid)
-   continue;
-
-   if (physbase < end_paddr &&
-   (physbase+size) > start_paddr) {
-   /* overlaps */
-   if (physbase < start_paddr) {
-   size -= start_paddr - physbase;
-   physbase = start_paddr;
-   }
-
-   if (size > end_paddr - physbase)
-   size = end_paddr - physbase;
-
-   dbg("reserve_bootmem %lx %lx\n", physbase,
-   size);
-   reserve_bootmem_node(NODE_DATA(nid), physbase,
-size, BOOTMEM_DEFAULT);
-   }
+   /* Mark reserved regions */
+   for (i = 0; i < lmb.reserved.cnt; i++) {
+   unsigned long physbase = lmb.reserved.region[i].base;
+   unsigned long size = lmb.reserved.region[i].size;
+   unsigned long start_pfn = physbase >> PAGE_SHIFT;
+   unsigned long end_pfn = ((physbase+size-1) >> PAGE_SHIFT);
+   struct node_active_region *node_ar;
+
+   node_ar = get_node_active_region(start_pfn);
+   while (start_pfn < end_pfn && node_ar != NULL) {
+   /*
+* if reserved region extends past active region
+* then trim size to active region
+*/
+   if (end_pfn >= node_ar->end_pfn)
+   

Re: [PATCH] powerpc: GE Fanuc's FPGA based PIC controller on the SBC610

2008-09-30 Thread Martyn Welch
On Tue, 30 Sep 2008 09:50:58 -0500
Kumar Gala [EMAIL PROTECTED] wrote:
 
 On Sep 30, 2008, at 9:29 AM, Martyn Welch wrote:
 
 
  arch/powerpc/boot/dts/gef_sbc610.dts |   38
  arch/powerpc/platforms/86xx/gef_sbc610.c |   25 +++
  arch/powerpc/sysdev/Makefile |2
  arch/powerpc/sysdev/gef_pic.c|  258 ++
  arch/powerpc/sysdev/gef_pic.h|   11 +
 
 The gef_pic should really live in platforms/86xx/ since it's specific
 to your board.
 

Ah, ok. I'll move it and resubmit.

 - k


-- 
Martyn Welch MEng MPhil MIET (Principal Software Engineer)   T:+44(0)1327322748
GE Fanuc Intelligent Platforms Ltd,|Registered in England and Wales
Tove Valley Business Park, Towcester,  |(3828642) at 100 Barbirolli Square,
Northants, NN12 6PF, UK T:+44(0)1327359444 |Manchester,M2 3AB  VAT:GB 729849476


Re: USB support on mpc5200 broken

2008-09-30 Thread Matt Sealey


David Gibson wrote:


This, of course, is exactly why I *don't* recommend embedded platforms
move to including the device tree in the flashed firmware.  Keeping
the device tree in the bootwrapper means that it *is* updated with the
kernel and we don't have to mess around with as much backwards
compatibility junk.


Pardon my language, but this is such bullshit.

This isn't including a device tree in flashed firmware, this is
having a real Open Firmware. We don't embed anything in there, it's
procedurally generated on each boot.

Our whole problem here is that we have a device tree which was fixed
for production before the device tree specification was nailed down
for the MPC5200B, and it's still in flux. We can't be expected to
walk lock-step with a 3 month kernel development cycle and we certainly
do not appreciate sidelining real firmware in favor of static device
trees which need to be compiled *per board*.

All the FDT does is move a lot of extra hardcoded values out of the
kernel and into a just-as-annoying extra file you need to be wary of
keeping up to date since the format and specification changes so much.

We never had this much whining about Apple's device tree, people just
implemented the workarounds..

--
Matt Sealey [EMAIL PROTECTED]
Genesi, Manager, Developer Relations


Re: USB support on mpc5200 broken

2008-09-30 Thread Matt Sealey

Jon Smirl wrote:



Efika has this:
compatible = "fsl,mpc5200b-ohci","fsl,mpc5200-ohci";


It doesn't :D

My system, running production firmware, says

ohci-bigendian,ohci-be,mpc5200-ohci,mpc5200-usb

This is what we were recommended to use at the time. There is a patch
on www.powerdeveloper.org which tweaks the tree to make it ultra-compliant
with the Linux version of things, which implements every variation. It
also implements a suggested patch which added a big-endian property
(not built in to the compatible property, but another property).

I don't see why THAT patch got reverted as it was a great idea that we
all agreed was a great idea.

Linux development around here is getting really schizophrenic. Nobody
is writing these decisions down even as comments in the source code..


If we really need a big endian flag, should it be an attribute?


Yes.


Shouldn't the driver already know it is being used on a BE machine?


No; you can have little endian OHCI controllers on big endian machines.
It's a property of the host controller, not the system architecture, just
like PCI is always little endian (except when you have magic in hardware
like Amiga PowerUP cards which endianswap for you :)

--
Matt Sealey [EMAIL PROTECTED]
Genesi, Manager, Developer Relations



Re: Device Tree

2008-09-30 Thread Matt Sealey


Sergei Shtylyov wrote:

Hello.

Sébastien Chrétien wrote:

Hello,
I have a question about Device Tree.
Is Device Tree found only on Linux PowerPC?


  Not only Linux as it's a part of Open Firmware which is also used at 
least on SPARC.


The Toshiba TOPAS910 ARM development board also runs Open Firmware and
contains patches to support OF device trees.

I dare say there might be an x86 box or two out there, too. But they
have ACPI tables too which is far more common..

--
Matt Sealey [EMAIL PROTECTED]
Genesi, Manager, Developer Relations

[PATCH v2] powerpc: GE Fanuc's FPGA based PIC controller on the SBC610

2008-09-30 Thread Martyn Welch
Support for the SBC610 VPX Single Board Computer from GE Fanuc (PowerPC 
MPC8641D).

A number of MPC8641D based boards route on-board interrupts through
an FPGA based interrupt controller, which is chained with the
MPC8641D's mpic. This patch provides a basic driver to allow routing
of these interrupts to the mpic.

Signed-off-by: Martyn Welch [EMAIL PROTECTED]
---

Kumar: Thank you for your fast response.

Changes in version 2:
 * Driver moved from sysdev to platforms/86xx

 arch/powerpc/boot/dts/gef_sbc610.dts |   38 
 arch/powerpc/platforms/86xx/Makefile |2 
 arch/powerpc/platforms/86xx/gef_pic.c|  258 ++
 arch/powerpc/platforms/86xx/gef_pic.h|   11 +
 arch/powerpc/platforms/86xx/gef_sbc610.c |   25 +++
 5 files changed, 328 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/boot/dts/gef_sbc610.dts 
b/arch/powerpc/boot/dts/gef_sbc610.dts
index 80b79e4..d7a591b 100644
--- a/arch/powerpc/boot/dts/gef_sbc610.dts
+++ b/arch/powerpc/boot/dts/gef_sbc610.dts
@@ -67,6 +67,36 @@
reg = 0x0 0x4000; // set by uboot
};
 
+   [EMAIL PROTECTED] {
+   #address-cells = 2;
+   #size-cells = 1;
+   compatible = fsl,mpc8641-localbus, simple-bus;
+   reg = 0xf8005000 0x1000;
+   interrupts = 19 2;
+   interrupt-parent = mpic;
+
+   ranges = 0 0 0xff00 0x0100 // 16MB Boot flash
+ 1 0 0xe800 0x0800 // Paged Flash 0
+ 2 0 0xe000 0x0800 // Paged Flash 1
+ 3 0 0xfc10 0x0002 // NVRAM
+ 4 0 0xfc00 0x8000 // FPGA
+ 5 0 0xfc008000 0x8000 // AFIX FPGA
+ 6 0 0xfd00 0x0080 // IO FPGA (8-bit)
+ 7 0 0xfd80 0x0080;   // IO FPGA (32-bit)
+
+   gef_pic: [EMAIL PROTECTED],4000 {
+   #interrupt-cells = 2;
+   interrupt-controller;
+   device_type = interrupt-controller;
+   compatible = gef,fpga-pic;
+   reg = 0x4 0x4000 0x20;
+   interrupts = 0x8 0x1
+ 0x9 0x1;
+   interrupt-parent = mpic;
+
+   };
+   };
+
[EMAIL PROTECTED] {
#address-cells = 1;
#size-cells = 1;
@@ -150,13 +180,13 @@
reg = 0x24520 0x20;
 
phy0: [EMAIL PROTECTED] {
-   interrupt-parent = mpic;
-   interrupts = 0x0 0x1;
+   interrupt-parent = gef_pic;
+   interrupts = 0x9 0x4;
reg = 1;
};
phy2: [EMAIL PROTECTED] {
-   interrupt-parent = mpic;
-   interrupts = 0x0 0x1;
+   interrupt-parent = gef_pic;
+   interrupts = 0x8 0x4;
reg = 3;
};
};
diff --git a/arch/powerpc/platforms/86xx/Makefile 
b/arch/powerpc/platforms/86xx/Makefile
index cb9fc8f..4a56ff6 100644
--- a/arch/powerpc/platforms/86xx/Makefile
+++ b/arch/powerpc/platforms/86xx/Makefile
@@ -7,4 +7,4 @@ obj-$(CONFIG_SMP)   += mpc86xx_smp.o
 obj-$(CONFIG_MPC8641_HPCN) += mpc86xx_hpcn.o
 obj-$(CONFIG_SBC8641D) += sbc8641d.o
 obj-$(CONFIG_MPC8610_HPCD) += mpc8610_hpcd.o
-obj-$(CONFIG_GEF_SBC610)   += gef_sbc610.o
+obj-$(CONFIG_GEF_SBC610)   += gef_sbc610.o gef_pic.o
diff --git a/arch/powerpc/platforms/86xx/gef_pic.c 
b/arch/powerpc/platforms/86xx/gef_pic.c
new file mode 100644
index 000..50d0a2b
--- /dev/null
+++ b/arch/powerpc/platforms/86xx/gef_pic.c
@@ -0,0 +1,258 @@
+/*
+ * Interrupt handling for GE Fanuc's FPGA based PIC
+ *
+ * Author: Martyn Welch [EMAIL PROTECTED]
+ *
+ * 2008 (c) GE Fanuc Intelligent Platforms Embedded Systems, Inc.
+ *
+ * This file is licensed under the terms of the GNU General Public License
+ * version 2.  This program is licensed as is without any warranty of any
+ * kind, whether express or implied.
+ */
+
+#include <linux/stddef.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/interrupt.h>
+#include <linux/spinlock.h>
+
+#include <asm/byteorder.h>
+#include <asm/io.h>
+#include <asm/prom.h>
+#include <asm/irq.h>
+
+#include "gef_pic.h"
+
+#define DEBUG
+#undef DEBUG
+
+#ifdef DEBUG
+#define DBG(fmt...) do { printk(KERN_DEBUG "gef_pic: " fmt); } while (0)
+#else
+#define DBG(fmt...) do { } while (0)
+#endif
+
+#define GEF_PIC_NUM_IRQS   32
+
+/* Interrupt Controller Interface Registers */
+#define GEF_PIC_INTR_STATUS0x
+
+#define 

Re: [PATCH] properly reserve in bootmem the lmb reserved regions that cross numa nodes

2008-09-30 Thread Adam Litke
This seems like the right approach to me.  I have pointed out a few
stylistic issues below.

On Tue, 2008-09-30 at 09:53 -0500, Jon Tollefson wrote:
<snip>
 + /* Mark reserved regions */
 + for (i = 0; i < lmb.reserved.cnt; i++) {
 + unsigned long physbase = lmb.reserved.region[i].base;
 + unsigned long size = lmb.reserved.region[i].size;
 + unsigned long start_pfn = physbase >> PAGE_SHIFT;
 + unsigned long end_pfn = ((physbase+size-1) >> PAGE_SHIFT);

CodingStyle dictates that this should be:
unsigned long end_pfn = ((physbase + size - 1) >> PAGE_SHIFT);

<snip>

 +/**
 + * get_node_active_region - Return active region containing start_pfn
 + * @start_pfn The page to return the region for.
 + *
 + * It will return NULL if active region is not found.
 + */
 +struct node_active_region *get_node_active_region(
 + unsigned long start_pfn)

Bad style.  I think the convention would be to write it like this:

struct node_active_region *
get_node_active_region(unsigned long start_pfn)

 +{
 + int i;
 + for (i = 0; i < nr_nodemap_entries; i++) {
 + unsigned long node_start_pfn = early_node_map[i].start_pfn;
 + unsigned long node_end_pfn = early_node_map[i].end_pfn;
 +
 + if (node_start_pfn <= start_pfn && node_end_pfn > start_pfn)
 + return &early_node_map[i];
 + }
 + return NULL;
 +}

Since this is using the early_node_map[], should we mark the function
__mminit?  
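
For anyone following the review, the lookup being discussed can be sketched as a standalone C fragment. This is only an illustration under assumptions: `demo_region`, `demo_map` and `demo_get_active_region` are made-up names standing in for `struct node_active_region`, `early_node_map[]` and `get_node_active_region()`, and the table contents are invented.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct node_active_region. */
struct demo_region {
    unsigned long start_pfn;   /* first page frame in the region */
    unsigned long end_pfn;     /* one past the last page frame */
    int nid;                   /* NUMA node owning the region */
};

/* Invented table: two nodes with a hole between them. */
static struct demo_region demo_map[] = {
    { 0x000, 0x100, 0 },
    { 0x180, 0x200, 1 },
};

/* Return the region containing pfn, or NULL if pfn falls in a hole. */
static struct demo_region *demo_get_active_region(unsigned long pfn)
{
    size_t i;

    for (i = 0; i < sizeof(demo_map) / sizeof(demo_map[0]); i++) {
        if (demo_map[i].start_pfn <= pfn && demo_map[i].end_pfn > pfn)
            return &demo_map[i];
    }
    return NULL;
}
```

Note that `end_pfn` is exclusive in this sketch, so a pfn equal to a region's `end_pfn` is reported as not found, which is why the caller has to cope with a NULL return.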

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center



Re: New dma-noncoherent code, looking for comment and people to test

2008-09-30 Thread Remi Machet
On Tue, 2008-09-30 at 17:21 +1000, Benjamin Herrenschmidt wrote:
 On Mon, 2008-09-29 at 10:26 -0700, Remi Machet wrote:
  
  I also removed the HIGHMEM support in dma_sync since memory allocated for
  DMA transfer should always be in ZONE_DMA (ie not in ZONE_HIGHMEM).
 
 While I like the idea of simplifying that stuff, the above sentence is
 incorrect unfortunately.
 
 ZONE_DMA is an artifact of x86 ISA DMA limitations. You -will- get
 request for mapping pages for DMA that have been allocated within
 different zones (notably highmem).
 
 The problem with highmem is that whether you can or not DMA to/from
 highmem is somewhat unclear, drivers set flags individually in various
 layers to allow it, which is definitely not the right place to do so. So
 while it would be nice to think we never will, in practice, we do.
 

Yes, I realized that looking at the changes Becky's made. I will put
back the highmem support when merging my changes with those.

Thanks!

Remi



RE: Disabling L1 D-cache and side effects

2008-09-30 Thread Tirumala Reddy Marri
 
Ben,
  Thanks for the response. I am wondering how user space would get
affected by absence of L1 Dcache.
Thanks,
Marri




Re: [PATCH] powerpc: GE Fanuc's FPGA based PIC controller on the SBC610

2008-09-30 Thread Scott Wood
On Tue, Sep 30, 2008 at 03:29:42PM +0100, Martyn Welch wrote:
 + gef_pic: [EMAIL PROTECTED],4000 {
 + #interrupt-cells = 2;

What is the second interrupt cell for, given that all interrupts are
level-triggered and you don't implement .set_type?

-Scott


Re: [RFC] powerpc/boot: add kernel,end node to the cuboot target

2008-09-30 Thread Sebastian Siewior

Milton Miller wrote:


|load: entry = 0x80053c flags = 0
|nr_segments = 2
|segment[0].buf   = 0x1002b8f0
|segment[0].bufsz = 80
|segment[0].mem   = (nil)
|segment[0].memsz = 1000
|segment[1].buf   = 0x4803f008
|segment[1].bufsz = 3a3138
|segment[1].mem   = 0x80
|segment[1].memsz = 3b


I would expect a third segment (kernel/zImage, dtb, and purgatory), but 
it's not clear that you are getting that far yet.


segment 0 looks like a small segment which should create boot loader 
environment. That one does nothing.

Segment 1 is my cuImage. What is purgatory?


Now. The entry address in image-start is valid and is the entrypoint of
the custom cuImage. Custom means that it does not depend on any register
values passed from u-boot (the original one needs a pointer to bd_t).
The only requirement is a valid 1:1 memory mapping.


ok sounds good.  does this have the dtb in it too?

Yes it does.


The branch above is taken, so I've found my current mapping


ok, but should you not be using PID0 explicitly to say global only?

The kernel mapping should only be global and therefore that might be a 
good idea.


obviously, a jtag or similar hardware debugger would be best.  Second

I have a CodeWarrior USB TAP here, but after more than an hour of playing 
with that thing I started to hack up an assembly character-output routine 
instead. It helped more :) kexec seems to work now :) I get "nobody cared" 
for irq X from time to time, so I think I still have to fix something here.


As a final note, it looks like you are currently replacing the code in 
relocate_new_kernel with book-e code.  Obviously this will need 
refinement to select or move to head_xx to merge.

Yep, that is what is going to happen next. I would prefer to have them 
runtime switchable instead of build dependent.


Again, I don't have any direct experience, but maybe this gives you 
some ideas.

Your hints helped. Thx for that.


milton

Sebastian


Re: [PATCH] Remove legacy kdump kernel build support

2008-09-30 Thread Anton Vorontsov
On Tue, Sep 30, 2008 at 06:18:28PM +0530, Mohan Kumar M wrote:
 Remove legacy kdump kernel build support
 
 This patch removes legacy kdump kernel build support (i.e. compiling a
 kdump kernel at a fixed hardcoded address of 32MB). With relocatable
 kernel support it's now possible to use the regular kernel binary for
 capturing the dump as well. A relocatable kdump kernel does not require
 trampoline code for exception handlers. The patch removes kdump.h as most
 of the macros defined in kdump.h are not used any more. The relocatable
 kdump kernel also allows us to load the kdump kernel anywhere, as specified
 by the crashkernel parameter, instead of at the hardcoded address of 32MB.

Can we leave the legacy support for a while? On PPC32 we don't have
the relocatable kernel support, but we have kdump patches floating
around[1], and they use the `hard-coded values' approach, so far.

I'm not sure if anybody is currently working on a PPC32 kernel
relocation support.. but for sure it will take some time to implement.

Thanks,

[1] http://ozlabs.org/pipermail/linuxppc-dev/2008-August/061161.html

-- 
Anton Vorontsov
email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2


Patches added to 4xx 'next' branch

2008-09-30 Thread Josh Boyer
Hi All,

The following patches have been added to the 4xx 'next' branch

Josh Boyer (3):
  ibm_newemac: Allow the no flow control EMAC feature to work
  ibm_newemac: Introduce mal_has_feature
  ibm_newemac: MAL support for PowerPC 405EZ

Matthias Fuchs (1):
  ppc4xx: Allow 4xx PCI bridge to be disabled via device tree

Victor Gallardo (1):
  ibm_newemac: Add support for GPCS, SGMII and M88E1112 PHY

As usual, they will sit there for a few days and then I'll ask Paul
(or maybe Ben) to pull into the overall powerpc.git tree.

josh


RE: Disabling L1 D-cache and side effects

2008-09-30 Thread Benjamin Herrenschmidt
On Tue, 2008-09-30 at 09:57 -0700, Tirumala Reddy Marri wrote:
 Ben,
   Thanks for the response. I am wondering how user space would get
 affected by absence of L1 Dcache.

You didn't answer my question :-)

Well, as I said, things like lwarx/stwcx not working, dcbz taking
alignment exceptions, etc...
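
To make the userspace concern concrete: ordinary portable code reaches lwarx/stwcx. through plain atomic operations, without ever naming those instructions. A minimal sketch, assuming a C11 compiler; `demo_atomic_add` is a made-up helper, and the statement that such a loop is lowered to a lwarx/stwcx. pair is the usual behaviour of PowerPC compilers, not something measured on the 440 in this thread.

```c
#include <stdatomic.h>

/*
 * Add delta to *v atomically and return the new value.
 * The compare-exchange retry loop is the kind of code that a PowerPC
 * compiler typically turns into a lwarx/stwcx. reservation loop.
 */
static int demo_atomic_add(atomic_int *v, int delta)
{
    int old = atomic_load(v);

    /* On failure, 'old' is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak(v, &old, old + delta))
        ;
    return old + delta;
}
```

Mutexes, reference counts and lock-free queues in userspace libraries are commonly built on operations like this one, so a reservation mechanism that misbehaves would show up far from the kernel.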

Ben.




RE: Disabling L1 D-cache and side effects

2008-09-30 Thread Tirumala Reddy Marri
Ben,
I have to bring up Linux on one of the 440 processors without the L1
D-cache to do some benchmarking and compare against the L1 D-cache
enabled case.

I am avoiding any references to dcbz, dcbt and dcbst. Also, the TLB
entries are created cache inhibited. I looked at the lwarx/stwcx
description and there seems to be no dependency on the L1 cache.

I don't see any critical exceptions or traps. All I see is /init/bin
failing to execute because data is corrupted.

Thanks,
Marri





Re: Device Tree

2008-09-30 Thread Gerald Van Baren
Matt Sealey wrote:

 Sergei Shtylyov wrote:
 Hello.

 Sébastien Chrétien wrote:
 Hello,
 I have a question about Device Tree.
 Is Device Tree found only on Linux PowerPC?

   Not only Linux as it's a part of Open Firmware which is also used at
 least on SPARC.

 The Toshiba TOPAS910 ARM development board also runs Open Firmware and
 contains patches to support OF device trees.

 I dare say there might be an x86 box or two out there, too. But they
 have ACPI tables too which is far more common..

More than a box or two: lots of OLPC XOs out there now.  ;-)
http://wiki.laptop.org/go/Open_Firmware

Best regards,
gvb


RE: Disabling L1 D-cache and side effects

2008-09-30 Thread Benjamin Herrenschmidt
On Tue, 2008-09-30 at 15:26 -0700, Tirumala Reddy Marri wrote:
 Ben,
 I got to bring up Linux on one of the 440 processors with out L1
 dcache to do some bench marking  and compare with  L1 d-cache enabled.  
 
 I am avoiding any references to dcbz ,dcbt and dcbst .   Also the TLB's
 are created with cache inhibited. I looked at lwarx/stwcx description,
 there seem to be no dependency on L1 cache.

Ok. Well, they are generally implemented at the L2 level, but maybe not
on 440. Architecturally, they must be used on cacheable memory, but it's
possible that, with 440 not being SMP coherent, the actual implementation
of those is too dumb to care.

 I don't see any critical exceptions or traps. All I  see is /init/bin
 failing to execute because data is corrupted. 

Have you properly replaced dcbz with multiple stores ? I did some bring
up work internally on some stuff where dcbz wasn't quite there yet, and
one pitfall to be careful about is that if you force-enable the alternate
CONFIG_8xx implementation in the various copy and memset routines in
arch/powerpc/lib, you also need to fix those implementations to copy
or clear 32 bytes instead of just 16, as 8xx has 16 byte cache lines.

Typically failing to do so causes things like memset to fail to properly
clear things such as page tables and thus random crap occurs.
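
The pitfall can be sketched in plain C. This is an illustration only, not the arch/powerpc/lib code: `demo_clear_line`, `demo_clear_lines` and `DEMO_CACHE_LINE_BYTES` are made-up names, and the 32 byte line size is taken from the paragraph above.

```c
#include <stddef.h>
#include <stdint.h>

/* Line size assumed from the discussion: 32 bytes here, 16 on 8xx. */
#define DEMO_CACHE_LINE_BYTES 32u

/* Zero one cache-line-sized block with plain word stores (no dcbz). */
static void demo_clear_line(uint32_t *line)
{
    size_t i;

    for (i = 0; i < DEMO_CACHE_LINE_BYTES / sizeof(uint32_t); i++)
        line[i] = 0;
}

/* Zero n_lines consecutive line-sized blocks starting at start. */
static void demo_clear_lines(uint32_t *start, size_t n_lines)
{
    size_t words_per_line = DEMO_CACHE_LINE_BYTES / sizeof(uint32_t);
    size_t n;

    for (n = 0; n < n_lines; n++)
        demo_clear_line(start + n * words_per_line);
}
```

If the constant were left at 16 while the callers still advance by 32 bytes per iteration, the second half of every line would keep its old contents, which matches the "memset fails to clear page tables" symptom.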

Cheers,
Ben.




Re: Device Tree

2008-09-30 Thread Matt Sealey


Gerald Van Baren wrote:

Matt Sealey wrote:



The Toshiba TOPAS910 ARM development board also runs Open Firmware and
contains patches to support OF device trees.

I dare say there might be an x86 box or two out there, too. But they
have ACPI tables too which is far more common..


More than a box or two: lots of OLPC XOs out there now.  ;-)
http://wiki.laptop.org/go/Open_Firmware


I was thinking more in a platform kind of numbers rather than a sales
kind of numbers, but you're right.

It's far more common than people might think at first glance. With x86
I am sure it would benefit the platform a little more if the OF support
was in-line with the shared code between PPC and SPARC (and now I guess,
ARM) but nevertheless it's an Open Firmware platform and something that
appeared not too long ago.

OF is still a going concern; if you want a nice flexible firmware, why
not use it? Most of the implementations are open source (FirmWorks and
CodeGen trees, and the Sun reference design) too.

--
Matt Sealey [EMAIL PROTECTED]
Genesi, Manager, Developer Relations


Re: [PATCH v2] powerpc: GE Fanuc's FPGA based PIC controller on the SBC610

2008-09-30 Thread David Gibson
On Tue, Sep 30, 2008 at 04:34:56PM +0100, Martyn Welch wrote:
 Support for the SBC610 VPX Single Board Computer from GE Fanuc (PowerPC 
 MPC8641D).
 
 A number of MPC8641D based route interrupts for on-board interrupts through
 a FPGA based interrupt controller, which is chained with the
 MPC8641D's mpic. This patch provides a basic driver to allow basic routing
 of interrupts to the mpic.
 
 Signed-off-by: Martyn Welch [EMAIL PROTECTED]
 ---
 
 Kumar: Thank you for you fast response.
 
 Change for version 2:
  * Driver moved from sysdev to platform/86xx
 
  arch/powerpc/boot/dts/gef_sbc610.dts |   38 
  arch/powerpc/platforms/86xx/Makefile |2 
  arch/powerpc/platforms/86xx/gef_pic.c|  258 
 ++
  arch/powerpc/platforms/86xx/gef_pic.h|   11 +
  arch/powerpc/platforms/86xx/gef_sbc610.c |   25 +++
  5 files changed, 328 insertions(+), 6 deletions(-)
 
 diff --git a/arch/powerpc/boot/dts/gef_sbc610.dts 
 b/arch/powerpc/boot/dts/gef_sbc610.dts
 index 80b79e4..d7a591b 100644
 --- a/arch/powerpc/boot/dts/gef_sbc610.dts
 +++ b/arch/powerpc/boot/dts/gef_sbc610.dts
 @@ -67,6 +67,36 @@
   reg = 0x0 0x4000; // set by uboot
   };
  
 + [EMAIL PROTECTED] {
 + #address-cells = 2;
 + #size-cells = 1;
 + compatible = fsl,mpc8641-localbus, simple-bus;
 + reg = 0xf8005000 0x1000;
 + interrupts = 19 2;
 + interrupt-parent = mpic;
 +
 + ranges = 0 0 0xff00 0x0100 // 16MB Boot flash
 +   1 0 0xe800 0x0800 // Paged Flash 0
 +   2 0 0xe000 0x0800 // Paged Flash 1
 +   3 0 0xfc10 0x0002 // NVRAM
 +   4 0 0xfc00 0x8000 // FPGA
 +   5 0 0xfc008000 0x8000 // AFIX FPGA
 +   6 0 0xfd00 0x0080 // IO FPGA (8-bit)
 +   7 0 0xfd80 0x0080;   // IO FPGA (32-bit)
 +
 + gef_pic: [EMAIL PROTECTED],4000 {
 + #interrupt-cells = 2;
 + interrupt-controller;
 + device_type = interrupt-controller;

I don't think you should need this device_type.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


Re: Device Tree

2008-09-30 Thread Benjamin Herrenschmidt
On Tue, 2008-09-30 at 18:08 -0500, Matt Sealey wrote:
 It's far more common than people might think at first glance. With x86
 I am sure it would benefit the platform a little more if the OF support
 was in-line with the shared code between PPC and SPARC (and now I guess,
 ARM) but nevertheless it's an Open Firmware platform and something that
 appeared not too long ago.
 
 OF is still a going concern; if you want a nice flexible firmware, why
 not use it? Most of the implementations are open source (FirmWorks and
 CodeGen trees, and the Sun reference design) too.

And SLOF :-)

Cheers,
Ben.




Re: USB support on mpc5200 broken

2008-09-30 Thread Benjamin Herrenschmidt

 This is what we were recommended to use at the time. There is a patch
 on www.powerdeveloper.org which tweaks the tree to make it ultra-compliant
 with the Linux version of things, which implements every variation. It
 also implements a suggested patch which added a big-endian property
 (not built in to the compatible property, but another property).
 
 I don't see why THAT patch got reverted as it was a great idea that we
 all agreed was a great idea.

I agree. Something needs to be fixed on the OHCI OF stuff, it should
definitely cope with the big-endian property (which is a practice
borrowed from Apple that I recommended I think back then) and I don't
see any problem with having ohci-be in the compatible property, it's
trivial enough to cope with in the driver and being anal about it on the
kernel side doesn't really bring any benefit.

Care to send a patch ?

 Linux development around here is getting really schizophrenic. Nobody
 is writing these decisions down even as comments in the source code..

That isn't entirely true. There's the ePAPR effort on power.org that is
codifying a lot of that, and there are binding documents dropped in
Documentation/powerpc.

 No; you can have little endian OHCI controllers on big endian machines.
 It's a property of the host controller, not the system architecture, just
 like PCI is always little endian (except when you have magic in hardware
 like Amiga PowerUP cards which endianswap for you :)

In fact, you can have both kinds on the same machine.
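
The idea that endianness belongs to the controller rather than to the CPU can be sketched as follows. This is a hypothetical illustration, not the OHCI driver: `demo_hc` and `demo_hc_decode32` are made-up names, and the flag is assumed to be filled in from something like a big-endian device tree property.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-controller state; one instance per host controller. */
struct demo_hc {
    bool big_endian_regs;   /* assumed to mirror a "big-endian" property */
};

/*
 * Decode a 32-bit register image byte by byte. Because the bytes are
 * assembled explicitly, the result does not depend on the host CPU's
 * own byte order, only on the controller's flag.
 */
static uint32_t demo_hc_decode32(const struct demo_hc *hc,
                                 const uint8_t raw[4])
{
    if (hc->big_endian_regs)
        return ((uint32_t)raw[0] << 24) | ((uint32_t)raw[1] << 16) |
               ((uint32_t)raw[2] << 8)  |  (uint32_t)raw[3];

    return ((uint32_t)raw[3] << 24) | ((uint32_t)raw[2] << 16) |
           ((uint32_t)raw[1] << 8)  |  (uint32_t)raw[0];
}
```

Keeping the flag in per-controller state is what allows a big-endian and a little-endian controller to coexist on one machine.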

Note about the Amiga stuff: it's a bad idea :-) Every attempt at
magically fixing endian in HW is a recipe for tears and disasters.
Approximately ... always. The only cases that I know that have a remote
chance of being useful are specifically programmable swappers on a given
device or per-page endian configuration in the processor (like BooKE).

Cheers,
Ben.



Re: [RFC] powerpc/boot: add kernel,end node to the cuboot target

2008-09-30 Thread Milton Miller

On Sep 30, 2008, at 12:21 PM, Sebastian Siewior wrote:

Milton Miller wrote:

|load: entry = 0x80053c flags = 0
|nr_segments = 2
|segment[0].buf   = 0x1002b8f0
|segment[0].bufsz = 80
|segment[0].mem   = (nil)
|segment[0].memsz = 1000
|segment[1].buf   = 0x4803f008
|segment[1].bufsz = 3a3138
|segment[1].mem   = 0x80
|segment[1].memsz = 3b
I would expect a third segment (kernel/zImage, dtb, and purgatory), 
but it's not clear that you are getting that far yet.


segment 0 looks like a small segment which should create boot loader 
environment. That one does nothing.

Segment 1 is my cuImage. What is purgatory?


Purgatory is the code that runs between the old kernel exiting and the
new image loading. It's supposed to be where any registers, dynamic
memory structures, etc. get set before calling the image supplied to
kexec user space. It's built as part of the kexec-tools suite as a
completely relocatable ELF and selected and edited based on the type of
image being loaded. For powerpc64 it is where we take the boot /
master cpu's physical id from r3, put it in the dtb header, and load
r3 with the address of the dtb before going into the kernel (for
vmlinux, and could do for zImage but don't have support upstream). If
you were booting a cuImage (as opposed to the code you are apparently
running, which is what Grant called a simple image, effectively), then
you would set any registers u-boot leaves behind in this code.
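
As a rough illustration of the "put it in the dtb header" step: in the
flattened device tree header the boot cpu id is the big-endian 32-bit
boot_cpuid_phys field at byte offset 28, present from header version 2
on. The sketch below patches that field in a Python bytearray.
set_boot_cpuid is a made-up helper name and the 40-byte header is a
hypothetical stand-in; this is not the purgatory code itself.

```python
import struct

FDT_MAGIC = 0xD00DFEED        # magic number at offset 0 of a dtb
OFF_VERSION = 20              # header field: version
OFF_BOOT_CPUID_PHYS = 28      # header field: boot_cpuid_phys (version >= 2)

def set_boot_cpuid(dtb, cpu_id):
    """Store the physical id of the boot cpu in the dtb header, in place."""
    magic, = struct.unpack_from(">I", dtb, 0)
    if magic != FDT_MAGIC:
        raise ValueError("not a flattened device tree")
    version, = struct.unpack_from(">I", dtb, OFF_VERSION)
    if version < 2:
        raise ValueError("header too old to carry boot_cpuid_phys")
    struct.pack_into(">I", dtb, OFF_BOOT_CPUID_PHYS, cpu_id)

# Hypothetical 40-byte version 17 header with all other fields zeroed.
header = bytearray(struct.pack(">10I", FDT_MAGIC, 40, 0, 0, 0, 17, 16, 0, 0, 0))
set_boot_cpuid(header, 3)
```

After the call, the only bytes that changed are the four at offset 28;
loading r3 with the dtb address is a separate step that has no
equivalent in this sketch.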


The standard code supplied by kexec-tools also calculates a checksum
(sha1) of each loaded segment (except itself) and checks that against the
sum calculated by kexec-tools userspace. On a mismatch it prints a message
that on powerpc has no way to be displayed, then goes into an infinite
spinloop. Oh well, I digress. Purgatory is also where, for kdump, any
memory backup copy is performed when a specific memory segment is needed
to boot (e.g. the initial pages for ppc64 and classic32, which require
the interrupt (exception) vectors to be in pages 0-2).
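
The check described above can be sketched roughly as follows. This is an
illustration only: digest_segments and verify_before_jump are
hypothetical names, the segment contents are made up, and using one
running digest over all segments (rather than one digest per segment) is
an assumption of the sketch.

```python
import hashlib

def digest_segments(segments, skip=None):
    """Hash every segment except the one at index `skip`."""
    h = hashlib.sha1()
    for index, data in enumerate(segments):
        if index != skip:
            h.update(data)
    return h.digest()

def verify_before_jump(segments, expected, skip=None):
    """Return True if memory still matches what userspace hashed."""
    return digest_segments(segments, skip) == expected

# Userspace side: hash what was loaded, leaving out purgatory (index 0).
segments = [b"purgatory code", b"kernel image", b"device tree blob"]
expected = digest_segments(segments, skip=0)

# Purgatory side: recompute just before jumping to the new image.
intact = verify_before_jump(segments, expected, skip=0)

# A stray write into the kernel segment after loading is caught.
damaged = [segments[0], b"kernel imagX", segments[2]]
corrupted = not verify_before_jump(damaged, expected, skip=0)
```

Leaving purgatory out of its own digest is what lets the expected value
be stored inside it without changing the result.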


The powerpc64 code reads the existing device tree from
/proc/device-tree and modifies a few things: initrd start and end,
bootargs (the command line), and (for kdump) which memory is available
and usable to the kernel, as opposed to reserved because it was used for
the old kernel, whose image we want to dump, and which could be under DMA.
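
A minimal sketch of that read-and-modify step, assuming the tree is
exposed as a directory hierarchy where directories are nodes and files
are properties. read_tree and update_chosen are hypothetical helpers,
not kexec-tools functions, and encoding the initrd addresses as 64-bit
big-endian values is an assumption made for the example.

```python
import os
import struct

def read_tree(root):
    """Load a /proc/device-tree style directory into nested dicts.

    Directories become nodes, regular files become raw property bytes.
    """
    node = {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path):
            node[name] = read_tree(path)
        else:
            with open(path, "rb") as f:
                node[name] = f.read()
    return node

def update_chosen(tree, bootargs, initrd_start, initrd_end):
    """Set the command line and initrd range under /chosen."""
    chosen = tree.setdefault("chosen", {})
    chosen["bootargs"] = bootargs.encode("ascii") + b"\0"
    chosen["linux,initrd-start"] = struct.pack(">Q", initrd_start)
    chosen["linux,initrd-end"] = struct.pack(">Q", initrd_end)
    return tree
```

On a running system this would be used as
update_chosen(read_tree("/proc/device-tree"), ...), after which the
dictionary still has to be flattened into a dtb. The kdump memory
bookkeeping is not covered here.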



 Now. The entry address in image->start is valid and is the entrypoint of
 the custom cuImage. Custom means that it does not depend on any register
 values passed from u-boot (the original one needs a pointer to bd_t).
 The only requirement is a valid 1:1 memory mapping.

 ok sounds good.  does this have the dtb in it too?

 Yes it does.


ok.   sounds like a simple image then ... ok to start with, but
eventually we want the dtb passed via the tool so we can set the command
line etc.


I actually developed the powerpc64 code this way too, and let someone
else make the standard tool work.  But the standard tool is useful.



 The branch above is taken, so I've found my current mapping

 ok, but should you not be using PID0 explicitly to say global only?

 The kernel mapping should only be global and therefore that might be a
 good idea.



 obviously, a jtag or similar hardware debugger would be best.  Second

 I have a CodeWarrior USB TAP here, but after more than one hour playing
 with that thing I started to hack up a char put in assembly. It helped
 more :) kexec seems to work now :) I get "irq X: nobody cared" from time
 to time so I think I have to fix something here.


kexec is a bit harder than kdump in that you have to make sure all 
devices have shutdown handlers.   Easier for those that are modules 
that can be loaded and unloaded (make sure they have a shutdown method 
that is comparable to unload, or even unload in a script to test).   
kdump is harder in that while the DMA is left running in the old
kernel, the new kernel has to fit in the cracks left over, and has to
initialize devices that were not shut down.


 As a final note, it looks like you are currently replacing the code
 in relocate_new_kernel with book-e code.  Obviously this will need
 refinement to select or move to head_xx to merge.

 Yep, this is what is going to happen next. I would prefer to have them
 runtime switchable instead of build dependent.


well, I am thinking that we will end up with one exit condition for all 
book-e, one for classic 32, and one for powerpc64.   I don't understand 
what you think should be runtime switchable, unless you were thinking 
about code that should be in purgatory (supplied by userspace as far as 
the kernel is concerned).


Remember the exit point of the kernel is a single entry point (we cheat 
and make it 2 on powerpc64, one for master and a second for slaves, 
although for book-e we could follow epapr instead), and specified pages 
of memory with user specified content.  The state is supposed to be an
emulation of mmu off, not "I just ran uboot and am its client
loader".


Again, I don't have any direct experience, but maybe this gives you
some