>>> Hi,
>>> I'm having a problem porting Linux kernel 2.4.21 to our MPC852T custom
>>> board. The kernel panics randomly with sig 11.
>>> The board boots up fine and we also get to the prompt. When we open 3-4
>>> telnet sessions and try to run some command, the kernel panics. This is
>>> completely random. Sometimes it even panics before opening the telnet
>>> session.
>>>
>>> <oops dump snipped>
>>>
>>You almost certainly have SDRAM problems. If you have thoroughly checked
>>out the complete address range statically, remember that burst accesses
>>will not occur until the cache is turned on, so your problem may be with
>>bursting. But you can also have severe problems like a missing address
>>line and linux still run for a few seconds.
>>
>>Mark Chambers

>We've checked the SDRAM. The timings (UPM) look fine. The problem
>however is that linux does not hang until after a few processes are
>started.
>If we boot to linux and leave it as it is, everything is fine and the
>board remains working. However each time a few processes (4-5 telnet
>sessions for eg.) are started the system either panics or hangs (goes
>dead).
>Thanks in advance,
>Akshay

We have been experiencing this same issue with random boards in production.
The exact same version of software will run for months on other instances of
the exact same board design, but a few percent get 'random' trap 300s. When
they do occur, it's only after Linux has booted and address translation and
caching are turned on. Examining the oopses and memory shows that some
location in SDRAM has a bogus value, but I don't have the tools to trace back
how it got that way.

I have ported a rigorous moving-inversions memory test into our firmware, and
have run it extensively across the entire SDRAM address space (the test code
executes from flash). I have let this test run continuously for hours and
hours, but never found a memory problem. Unfortunately, I do not have test
software that enables the MMU address translation or caching, so as Mark
said, I can't test memory using bursting.

Our hardware engineers have reviewed the designs very carefully and are quite
confident that there is plenty of margin in the memory timing. Signal quality
has also been carefully checked. Our manufacturing people have replaced the
CPU on some of these boards, and the problem went away.
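
For readers unfamiliar with the term, one round of a moving-inversions test can be sketched roughly as below. This is only an illustration, not the firmware test described above: the function names (mi_fill, mi_check_invert, moving_inversions), the 32-bit word width, and the single fixed pattern are all assumptions made for the example. As noted above, a loop like this only produces single-beat accesses while the data cache is off; it would probably need to run over a cache-enabled region to exercise SDRAM bursts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write `pattern` to every word, lowest address first. */
void mi_fill(volatile uint32_t *base, size_t words, uint32_t pattern)
{
    for (size_t i = 0; i < words; i++)
        base[i] = pattern;
}

/*
 * Hypothetical helper: walk the region (upward if `ascending` is nonzero,
 * downward otherwise), check that each word still holds `expect`, then
 * overwrite it with the bitwise complement. Returns the mismatch count.
 */
size_t mi_check_invert(volatile uint32_t *base, size_t words,
                       uint32_t expect, int ascending)
{
    size_t errors = 0;

    for (size_t n = 0; n < words; n++) {
        size_t i = ascending ? n : words - 1 - n;

        if (base[i] != expect)
            errors++;
        base[i] = ~expect;
    }
    return errors;
}

/*
 * One moving-inversions round: fill with `pattern`, verify and invert
 * going up, then verify the inverted value and restore going down.
 * The region holds `pattern` again when the round completes.
 */
size_t moving_inversions(volatile uint32_t *base, size_t words,
                         uint32_t pattern)
{
    size_t errors = 0;

    mi_fill(base, words, pattern);
    errors += mi_check_invert(base, words, pattern, 1);
    errors += mi_check_invert(base, words, ~pattern, 0);
    return errors;
}
```

A caller would typically repeat moving_inversions() over the region with several patterns (for example 0x55555555 and 0x00000000) and treat any nonzero return value as a failure.
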

If anyone else on the mailing list has experienced this issue, or has
developed a virtual address memory test, please let us know.

Ken Poole