Re: Maple PPC970 kexec crash-dump problems

Milton Miller Fri, 06 Feb 2009 08:58:25 -0800


On Feb 4, 2009, at 12:48 PM, Benjamin Walsh wrote:

Hi Milton,
I've tracked it down to the device tree passed to the second kernel being screwed up when patched by kexec-tools. Namely, it was creating linux,usable-memory entries that were wrong, and the MMU initialization hung when it failed allocating for the page tables. I hacked the tool and got past that point in the init sequence, but the very first IO mapped access fails, so the MMU doesn't seem to be set up correctly.


I would need more details on exactly what you think is wrong.

How does it fail?

If the first IO mapped access fails, then I would ask if you are using IOMMU. It is quite possible that the dart iommu code needs to be modified to use the existing mapping table instead of allocating a new table; otherwise any existing mappings being used by inflight dma would fail, and that might cause mmio loads to wait for uncompletable dma writes. Just a theory, given the lack of information you gave me.

Anyway, on to my question: is the crash dump (kdump) kernel supposed to use the memory reserved for it by the first kernel for its working memory? E.g., on that board, I have 0->2GB and 4->6GB for a total of 4GB of RAM. Let's say I reserve 1...@32m, that's 0x2000000->0xa000000. Is the second kernel supposed to use
(0x2000000+<kernel size>) -> 0xa000000

for its memory pool and leave everything else:

0->0x2000000, 0xa000000 -> 0x80000000, 0x100000000 -> 0x180000000

as memory that is from the first kernel, used to debug it ?
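
The ranges in the question can be checked with a little arithmetic. The following is only a sketch: the helper name subtract_region is made up for illustration, and the RAM layout (0->2GB and 4->6GB) and the reserved range 0x2000000->0xa000000 are taken from the message above. It subtracts the reserved range from the RAM ranges and prints what is left over for the first kernel.

```python
def subtract_region(ram_ranges, reserved):
    """Return the parts of ram_ranges outside reserved.

    All ranges are half-open (start, end) tuples. subtract_region is a
    hypothetical helper for this example, not a kexec-tools function.
    """
    res_start, res_end = reserved
    remaining = []
    for start, end in ram_ranges:
        # Piece of this RAM range that lies below the reserved region.
        low_end = min(end, res_start)
        if start < low_end:
            remaining.append((start, low_end))
        # Piece of this RAM range that lies above the reserved region.
        high_start = max(start, res_end)
        if high_start < end:
            remaining.append((high_start, end))
    return remaining


GB = 1 << 30
ram = [(0, 2 * GB), (4 * GB, 6 * GB)]   # RAM layout quoted in the message
crash = (0x2000000, 0xA000000)          # reserved range quoted in the message

first_kernel_memory = subtract_region(ram, crash)
for start, end in first_kernel_memory:
    print(f"{start:#x} -> {end:#x}")
```

The three ranges it prints correspond to the ones listed in the question, with 2GB written as 0x80000000.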



Yes, but that is not quite how the device tree is formed.

The second kernel will also use the interrupt vector area at address 0. Therefore that is saved as the backup region in purgatory to the address allocated in the kdump region. The device tree is then created with linux,usable-memory regions extending the kdump region back to 0, and a reserve entry marking the area as reserved.

In addition, the device tree gets the memory backing tce tables for pseries smp mode. It may need the page with the dart table, marked reserved, so that the table gets added to the linear map -- except it should be mapped cache inhibited so that may not work either.
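
As a rough illustration of the layout described above, here is a toy model of the regions involved. It is a sketch under assumptions: BACKUP_SIZE and the location of the backup slot are hypothetical values chosen for the example, not taken from kexec-tools, and the reserve entry and the tce/dart tables are not modeled. Only the kdump region bounds come from this thread.

```python
# Toy model of the layout; all ranges are half-open (start, end) tuples.
CRASH_START = 0x2000000   # start of the kdump region (from the thread)
CRASH_END = 0xA000000     # end of the kdump region (from the thread)

BACKUP_SIZE = 0x10000                                # hypothetical size
BACKUP_SRC = (0, BACKUP_SIZE)                        # low memory with the interrupt vectors
BACKUP_DEST = (CRASH_END - BACKUP_SIZE, CRASH_END)   # hypothetical slot in the kdump region

# linux,usable-memory as described: the kdump region extended back to 0.
usable_memory = (0, CRASH_END)


def contains(outer, inner):
    """True if inner lies entirely within outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def overlaps(a, b):
    """True if the two ranges share at least one address."""
    return a[0] < b[1] and b[0] < a[1]


layout_ok = (
    contains((CRASH_START, CRASH_END), BACKUP_DEST)  # backup copy lives in the kdump region
    and contains(usable_memory, BACKUP_SRC)          # second kernel may use the vector area
    and not overlaps(BACKUP_SRC, BACKUP_DEST)        # copying does not clobber the source
)
print(layout_ok)
```

The point of the model is only that the first kernel's low memory is preserved inside the kdump region before the second kernel reuses address 0.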

Basically, I am trying to figure out if I patched the tool correctly.

Thanks,
Ben

On Sat, Jan 24, 2009 at 2:52 AM, Milton Miller <milt...@bga.com> wrote:
On Sat Jan 24 at 07:59:47 EST in 2009, Benjamin Walsh wrote:
I am trying to use kexec with a crash dump kernel on a Maple board (Motorola ATCA6101 to be precise). This board is running a two-CPU PPC970FX. I am running a 2.6.27-10 kernel and have tried both older kexec-tools and the newest ones. I have tried SMP and non-SMP kernels.
Once you start the second cpu it is likely executing instructions somewhere.
Prior to 2.6.27 you had to compile a fixed offset kernel to run kdump. With 2.6.27 that option was removed and replaced with the relocatable kernel. However, because of the way linux interacts with open firmware, the kernel will still move itself to 0 unless a specific flag is set. The location of the flag was changed twice during the merge process, and the patches for kexec-tools were not made until early this year.
Using kexec -l to fast boot works correctly. However, loading a crash dump kernel and triggering a crash via echo c > /proc/sysrq-trigger simply hangs the board. I have traced the sequence down to after the call to kexec_copy_flush(), when the CPU returns to real-address mode (bl real_mode). At this point I have no further debugging information.
Two things could help me:
- Getting the fix if this is a known issue and a fix exists. I have looked at recent patches and nothing leapt to mind, mostly relocatable kernel support.
 That is a major change.
That said, I don't know if anyone has tested kexec panic beyond pseries for 64 bit powerpc.
I know Paul originally prototyped the relocatable patch on a powermac, but I don't know what, if any, smp testing he performed. And you said you are actually on maple, not a powermac, so the startup issues are different.
- Obtaining the address of the serial port @3f8 in real mode. The init sequence with udbg ON says that the physical address of the port is 0xf40003f8; however, setting it up in poll mode and trying to stuff characters in the tx buffer doesn't produce anything.
Ah yes. In real mode you can only talk to cacheable memory without implementation specific assistance. However, if you look in the kernel for the maple early udbg support, you will find the code you need to talk to that serial port in real mode.
 Has anyone recently tried to use the serial port in real mode ?

 Thanks for any help.

 Ben
Hope this gets you started. I wrote a lot of the kernel code, but I had the advantage of external jtag access to the processor to see where it ended up when it went astray.
 milton


_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
