Re: How do I debug kernel panic that occurs while running X?

2012-03-04 Thread Jason Heeris
On 4 March 2012 01:28, Brendon Higgins blhigg...@gmail.com wrote:
 Any more ideas? As I said, I tried getting kdump working but have been having
 trouble getting it to behave.

One more thought, but it's a bit of a long shot as to whether you have
the equipment. The most watertight way I know of to capture kernel
output is a serial port and another computer. If, by any chance, you
have one on your machine, edit /etc/default/grub to include
console=ttyS0,115200n8 (or whatever speed you like) on the kernel boot
line. You'll also need another machine with either a serial port or a
USB-serial adapter, making it half as likely that this will help you
:P
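
As a rough sketch of that setup, the relevant line in /etc/default/grub
might look like the fragment below. It assumes GRUB 2 as shipped by Debian,
the first serial port (ttyS0), and 115200 baud; the device names are
placeholders. The file is sourced as shell, so it is plain variable
assignment:

```shell
# /etc/default/grub (fragment)
# Keep the normal VGA console and also send kernel messages to the first
# serial port. ttyS0 and 115200n8 (115200 baud, no parity, 8 data bits)
# are assumptions; adjust them to your hardware.
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8"

# Afterwards, as root, regenerate the boot configuration and reboot:
#   update-grub
# On the second machine, watch the other end of the cable with, for example:
#   screen /dev/ttyUSB0 115200    # /dev/ttyUSB0 is a placeholder for your adapter
```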

Of course, most computers these days (*ahem*) don't have serial ports.
*Maybe* a USB-serial adapter will work for the target machine too...
although this requires an extra level of redirection on the part of
the kernel, and may not be as foolproof, so I wouldn't spend the money
if you don't already have one (or, in fact, two).

Other than that, I'm out of ideas. Hopefully someone else on this list
has an idea that doesn't require technology that was rendered obsolete
for most people by 1995...

— Jason


--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: 
http://lists.debian.org/CA+Zd3Fc6Y7SjN_RafobOV4NJ819wR=6kjg0jaf2kxzbysfg...@mail.gmail.com



Re: How do I debug kernel panic that occurs while running X?

2012-03-04 Thread Sven Joachim
On 2012-03-04 09:16 +0100, Jason Heeris wrote:

 On 4 March 2012 01:28, Brendon Higgins blhigg...@gmail.com wrote:
 Any more ideas? As I said, I tried getting kdump working but have been having
 trouble getting it to behave.

 One more thought, but it's a bit of a long shot as to whether you have
 the equipment. The most watertight way I know of to capture kernel
 output is a serial port and another computer. If, by any chance, you
 have one on your machine, edit /etc/default/grub to include
 console=ttyS0,115200n8 (or whatever speed you like) on the kernel boot
 line. You'll also need another machine with either a serial port or a
 USB-serial adapter, making it half as likely that this will help you
 :P

 Of course, most computers these days (*ahem*) don't have serial ports.
 *Maybe* a USB-serial adapter will work for the target machine too...
 although this requires an extra level of redirection on the part of
 the kernel, and may not be as foolproof, so I wouldn't spend the money
 if you don't already have one (or, in fact, two).

 Other than that, I'm out of ideas. Hopefully someone else on this list
 has an idea that doesn't require technology that was rendered obsolete
 for most people by 1995...

Either log in via ssh if that is still possible, or use netconsole to
capture kernel messages.
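
For anyone following along, here is a minimal sketch of a netconsole setup.
The parameter layout follows the kernel's netconsole.txt; the addresses,
ports, and interface name are example values only, and netconsole_param is
just a hypothetical helper that makes the field order visible.

```shell
# Build the netconsole module parameter:
#   src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac
# An empty tgt-mac leaves the default (broadcast) Ethernet address in place.
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "${6:-}"
}

# Example values only; substitute your own network details.
netconsole_param 6665 192.168.0.2 eth0 6666 192.168.0.1

# On the crashing machine, as root:
#   modprobe netconsole netconsole="$(netconsole_param 6665 192.168.0.2 eth0 6666 192.168.0.1)"
# On the receiving machine (option syntax varies between netcat flavours):
#   nc -u -l -p 6666
```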

Sven


http://www.kernel.org/doc/Documentation/networking/netconsole.txt
http://blog.mraw.org/2010/11/08/Debugging_using_netconsole/


Archive: http://lists.debian.org/87eht84p3v@turtle.gmx.de



Re: How do I debug kernel panic that occurs while running X?

2012-03-04 Thread Brendon Higgins
Hi,

Jason Heeris wrote (Sun March 4, 2012):
 The most watertight way I know of to capture kernel
 output is a serial port and another computer. If, by any chance, you
 have one on your machine

Not really an option (at least, not an easy one), I'm afraid. I only have two
machines, and neither of them has a serial port. Rendered obsolete, indeed.

Sven Joachim wrote (Sun March 4, 2012):
 Either log in via ssh if that is still possible,

The computer has gone well beyond the point of responding to anything coming 
in over the network by the time it has frozen.

 or use netconsole to capture kernel messages.

Thanks for the links. This looked promising, but I cannot get it to function. 
I can get the problem machine connected to a netbook successfully (I can talk 
between them over UDP using netcat). However, netconsole refuses to transmit 
any messages. I install the module just as the article you linked to suggests, 
and it appears to be correct when I see it logged in dmesg:
[103337.293616] netconsole: local port 6665
[103337.293626] netconsole: local IP 0.0.0.0
[103337.293632] netconsole: interface 'eth0'
[103337.293637] netconsole: remote port 
[103337.293642] netconsole: remote IP 192.168.0.1
[103337.293646] netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
[103337.293654] netconsole: local IP 192.168.0.2
[103337.293754] console [netcon0] enabled
[103337.293762] netconsole: network logging started
But this doesn't seem to work - there's just nothing transmitted. I can't even 
get it to send messages to a netcat listener on the same machine.
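
One thing that may be worth ruling out, and this is only a guess rather
than a diagnosis: netconsole, like any console, only receives messages whose
priority passes the console loglevel, which is the first field of
/proc/sys/kernel/printk. A small sketch for checking it (console_loglevel is
a hypothetical helper name):

```shell
# Print the current console loglevel (first field of /proc/sys/kernel/printk).
# An alternate file can be passed as $1, which is handy for testing.
console_loglevel() {
    awk '{ print $1; exit }' "${1:-/proc/sys/kernel/printk}"
}

# As root, the level can be raised so that every message reaches the
# consoles, and a test line can be injected into the kernel log:
#   dmesg -n 8
#   echo "netconsole test" > /dev/kmsg
```

If the test line shows up in dmesg but not at the listener, the loglevel is
probably not the culprit and the network path deserves another look.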

Have you (or anyone) found this approach works? Is there something I'm 
missing?

Thanks for your help so far.

Peace,
Brendon


Archive: http://lists.debian.org/201203041717.11587.blhigg...@gmail.com



Re: How do I debug kernel panic that occurs while running X?

2012-03-03 Thread Brendon Higgins
Hi again,

Charles Krinke wrote (Thu March 1, 2012):
 On the next boot, /var/log/messages should contain the last printk's from
 the kernel which would include any panic.

Thanks. I'd already checked there, though, and no dice. The log just skips 
from the last innocuous kernel message to messages about the next boot. 
Nothing about what caused the reboot to be necessary.

Jason Heeris wrote (Fri March 2, 2012):
 I've had problems with write caching causing the last few messages to
 be lost after a panic*, so if you don't see anything suspicious, maybe
 turn off write caching with 'hdparm -W 0 /dev/whatever' for long
 enough to reproduce the crash. Just in case.

Thanks for the tip. I gave that a try using the manual invocation of a kernel 
panic (i.e., echo c > /proc/sysrq-trigger) but I still get nothing in 
/var/log/messages about this event. So, I'm doubtful this'll work when a real 
panic strikes, unfortunately.
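
For reference, the test described above could be scripted roughly as
follows. This is a sketch only: the device path is a placeholder,
everything must run as root, and the last command crashes the machine on
purpose, so the function is defined here but deliberately not called.

```shell
# Disable the drive's write cache, flush pending writes, then force a crash.
# WARNING: calling this really does crash the running kernel.
trigger_test_panic() {
    disk=${1:?usage: trigger_test_panic /dev/DEVICE}   # e.g. /dev/sda (placeholder)
    hdparm -W 0 "$disk"              # turn off the drive's write cache
    sync                             # flush whatever is still buffered
    echo c > /proc/sysrq-trigger     # SysRq 'c' triggers a crash
}
```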

FWIW, at the moment I'm running the kernel in testing, but this has been a 
problem since back about 2.6.38 or 39. (I even tried doing a git bisection at 
one point, but being an intermittent problem it's difficult to determine when a 
particular commit doesn't exhibit it - I think I screwed it up at some point.)

Any more ideas? As I said, I tried getting kdump working but have been having 
trouble getting it to behave.

Peace,
Brendon


Archive: http://lists.debian.org/201203031228.44898.blhigg...@gmail.com



Re: How do I debug kernel panic that occurs while running X?

2012-03-01 Thread Charles Krinke
On the next boot, /var/log/messages should contain the last printk's from
the kernel, which would include any panic.

So, one should be able to tail /var/log/messages and see what the kernel
did at the time of the freeze.

Remember that the fresh boot is appended to /var/log/messages, so you need
to scroll back a hundred lines or so.
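
A small sketch of how to jump straight to the messages from just before the
latest boot, rather than scrolling by hand. It assumes each boot logs the
kernel banner line containing "Linux version"; lines_before_last_boot is a
hypothetical helper name.

```shell
# Print the N lines (default 100) that precede the most recent boot banner.
# Falls back to a plain tail if no banner is found in the file.
lines_before_last_boot() {
    file=$1
    count=${2:-100}
    # Line number of the last "Linux version" banner, if any.
    boot_line=$(grep -n 'Linux version' "$file" | tail -n 1 | cut -d: -f1)
    if [ -z "$boot_line" ]; then
        tail -n "$count" "$file"
        return
    fi
    head -n "$((boot_line - 1))" "$file" | tail -n "$count"
}

# Typical use (reading the log usually needs root or the adm group):
#   lines_before_last_boot /var/log/messages 100
```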

Charles
On Mar 1, 2012 8:41 PM, Brendon Higgins blhigg...@gmail.com wrote:

 Hi list,

 For the better part of a year, now, something has been causing my machine to
 freeze. The mouse stops moving on the screen, pressing any key (including keys
 that should toggle lights) does nothing. The freezes are intermittent, without
 warning, and I've been unable to determine if there is any particular cause.

 I think the kernel is panicking, but I can't tell for sure. I don't think it's
 caused by my hardware, either, because a Windows 7 install (Wintendo) seems to
 operate fine. The problem has never happened while I've been using the console,
 mostly because I'm there very rarely and I do the vast majority of my work in
 X. It's a desktop machine, after all.

 If it weren't for the fact of X being in the way when this happens, I might be
 far closer to finding the root cause of the problem I'm seeing. But the fact
 that I am unable to get any information at all from the kernel when the freeze
 occurs means I haven't been able to get anywhere with it in all this time. And
 yet it happens about once every few days. It's terrifically frustrating.

 I tried to get kdump working. I got as far as getting kexec running, and kdump
 claims to successfully load its kernel, but when I either manually cause a
 test panic or the bug happens, the new kernel fails to start, and so kdump
 never gets a chance to do its thing. kexec works fine to perform a regular
 restart of the machine, though - which is irritating, actually, because it gets
 in the way when I wish to reboot into Wintendo.

 This issue is actually beginning to cause me some distress. There must be a
 way to extract panic info when X is running - how would the graphics driver
 writers debug things, otherwise?

 So does anyone have any suggestions as to how I can make some progress on
 diagnosing this?

 I'd appreciate being CC'd on replies, as I'm not sub'd to the list. Thanks!

 Peace,
 Brendon


 Archive: http://lists.debian.org/201203012323.50867.blhigg...@gmail.com




Re: How do I debug kernel panic that occurs while running X?

2012-03-01 Thread Jason Heeris
On 2 March 2012 12:50, Charles Krinke charles.kri...@gmail.com wrote:
 So, one should be able to tail /var/log/messages and see what the kernel did
 at the time of the freeze.

I've had problems with write caching causing the last few messages to
be lost after a panic*, so if you don't see anything suspicious, maybe
turn off write caching with 'hdparm -W 0 /dev/whatever' for long
enough to reproduce the crash. Just in case.

* This was really only a problem because I was using a flash-based
drive (not USB, but for an embedded system) with *really* aggressive
caching. I don't know how bad the problem might be on a more ordinary
drive.

— Jason


Archive: 
http://lists.debian.org/ca+zd3fewattbjkp1km6armnkuhjmqfqce8cj3aaurf2dbjb...@mail.gmail.com