Re: How to go about debuging a system lockup?

2006-11-17 Thread Krzysztof Halasa
"Jesper Juhl" <[EMAIL PROTECTED]> writes:
> Or just try a few random older 2.6 kernels like 2.6.14, 2.6.9,
> 2.6.whatever (of course it needs to be a version that git knows
> about).
One can also do a "bisect" manually; that works with all kernels.
-- 
Krzysztof Halasa

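The manual "bisect" mentioned above is a binary search over an ordered list of kernel versions. The following is only a minimal sketch: the function name `manual_bisect`, the `is_bad` callback (standing in for "build, boot and run the lockup test by hand") and the version list are all hypothetical. It also assumes that a known-good version exists at the start of the list, which the thread has not established for this hardware.

```python
from typing import Callable, Sequence


def manual_bisect(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad version in an ordered list.

    Assumes (hypothetically) that versions[0] is known good, versions[-1]
    is known bad, and that once a version is bad all later ones are too.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] good, versions[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return versions[hi]


# Hypothetical example: pretend the lockup first appears in 2.6.14.
versions = [f"2.6.{n}" for n in range(9, 19)]  # 2.6.9 .. 2.6.18
first_bad = manual_bisect(versions, lambda v: versions.index(v) >= versions.index("2.6.14"))
```

In practice `is_bad` would be a person booting each kernel and running the test case, so each step is slow. Halving the range keeps the number of kernels to test to roughly log2 of the list length.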
Re: How to go about debuging a system lockup?

2006-11-17 Thread Stefan Richter
Lennart Sorensen wrote:
> OK, I have now tried connecting with firescope to just follow the dmesg
> buffer across firewire. Works great, until the system hangs, then
> firescope reports that it couldn't perform the read. I wonder what part
> of the system has to lock up for the firewire card to n

Re: How to go about debuging a system lockup?

2006-11-17 Thread Lennart Sorensen
On Fri, Nov 17, 2006 at 09:29:28AM -0500, Lennart Sorensen wrote:
> Wow, that looks really neat. I will have to go read up on that tool.
OK, I have now tried connecting with firescope to just follow the dmesg buffer across firewire. Works great, until the system hangs, then firescope reports tha

Re: How to go about debuging a system lockup?

2006-11-17 Thread Lennart Sorensen
On Fri, Nov 17, 2006 at 02:43:36PM +0100, Stefan Richter wrote:
> If the PCI bus itself isn't brought down, you could debug from remote
> using Benjamin Herrenschmidt's Firescope on the remote node and a
> FireWire card in the test machine. Once the ohci1394 driver was loaded,
> the FireWire contro

Re: How to go about debuging a system lockup?

2006-11-17 Thread Stefan Richter
Lennart Sorensen wrote:
> On Thu, Nov 16, 2006 at 04:01:03PM -0600, Protasevich, Natalie wrote:
>> There are some port 80 cards that you can buy:
...
> Hmm, one of those on the PCI bus might work. Or perhaps the parallel
> port will. Of course if the problem is that somehow the PCI bus is
> locke

Re: How to go about debuging a system lockup?

2006-11-16 Thread Lennart Sorensen
On Thu, Nov 16, 2006 at 04:01:03PM -0600, Protasevich, Natalie wrote:
> If you can't drop in kdb, or no sysrq, then your interrupts are
> disabled. It used to be (with older systems anyway) that an NMI button was
> on the system, so one could send an NMI and make the handler print a
> trace. Newer

RE: How to go about debuging a system lockup?

2006-11-16 Thread Protasevich, Natalie
> I don't know of a good version yet. I so far don't know if there ever
> was one. This could even be a bug in the PCI hardware, or the way the
> BIOS on this system on a board configured the PCI controller. Maybe I
> should go back and try a 2.4 kernel.
>
> > Hope some of that helps :)
>
> We

Re: How to go about debuging a system lockup?

2006-11-16 Thread Jesper Juhl
On 16/11/06, Lennart Sorensen <[EMAIL PROTECTED]> wrote:
On Thu, Nov 16, 2006 at 09:49:06PM +0100, Jesper Juhl wrote:
...
> - You could also try kdb (http://oss.sgi.com/projects/kdb/) or kgdb
> (http://kgdb.linsyssoft.com/). That might help you pinpoint the
> failure.
Can I run that remotely s

Re: How to go about debuging a system lockup?

2006-11-16 Thread Lennart Sorensen
On Thu, Nov 16, 2006 at 09:49:06PM +0100, Jesper Juhl wrote:
> Well, I have a few ideas that are hopefully useful.
>
> - If you have not done so already, then go into the "Kernel Hacking"
> section of the kernel configuration and enable some (all?) of the
> debug options and see if that produces a

Re: How to go about debuging a system lockup?

2006-11-16 Thread Jesper Juhl
On 16/11/06, Lennart Sorensen <[EMAIL PROTECTED]> wrote:
We have a router with a Geode SC1200 CPU, with 4 AMD 972 Ethernet ports (pcnet32) behind a PLX 6152 PCI-PCI bridge, which quite regularly locks up completely if we try to do simultaneous traffic on all 4 ports (our test case sends data from