Re: diagnosing system crash (hardware failure?)

2004-01-17 Thread Haines Brown
I agree with the person who just suggested that your system freeze
sounds like flaky RAM, although in my experience bad RAM more typically
causes an outright system crash than a freeze.

Last year I went through a tough time with keyboard/mouse freezes, and
it turned out to be a flaky nVidia GeForce video card. This was the
second nVidia card I had to return under warranty. I note that you
have an nVidia card, so if swapping RAM is not the answer and you can
easily swap in another video card, I suggest you try that.

Haines Brown 


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED] 
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: diagnosing system crash (hardware failure?)

2004-01-16 Thread Micha Feigin
On Fri, Jan 16, 2004 at 07:09:23PM -0500, Matt Price wrote:
> Hi everyone,
> 
> my work machine has been crashing spontaneously: X freezes, sshd goes
> down, and I can't use the keyboard.  This only happens
> when I'm in the office, so I think it likely has something to do with
> my physical presence...  
> 

Start with some basic diagnosis.
Do you see any flashing lights on the keyboard (most probably caps lock
+ scroll lock)? That would most probably signify an oops, and the
computer is usually still in a state where the error can be captured.
Are num lock and/or caps lock still functional (do they toggle the
lights on the keyboard)? If so, then the machine is at least somewhat
functional.
Either of these would mean that the problem can at least be recorded,
as a start.
Depending on how hard the lockup is, things like netdump, kdb and
sysrq can help capture it.
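
As a rough illustration of the sysrq route, the Python sketch below
checks whether the magic SysRq key is enabled before the next lockup
happens. It assumes a Linux kernel that exposes the setting at
/proc/sys/kernel/sysrq; the function names are made up for this example.

```python
from pathlib import Path

# Assumed location of the setting on Linux; other systems will not have it.
SYSRQ_PATH = Path("/proc/sys/kernel/sysrq")


def describe_sysrq(raw: str) -> str:
    """Turn the raw contents of the sysrq file into a short description."""
    value = int(raw.strip())
    if value == 0:
        return "disabled"
    if value == 1:
        return "all functions enabled"
    # Newer kernels treat larger values as a bitmask of allowed functions.
    return f"partially enabled (bitmask {value})"


def sysrq_status(path: Path = SYSRQ_PATH) -> str:
    """Report the SysRq state, or say so if the file cannot be read."""
    try:
        return describe_sysrq(path.read_text())
    except (OSError, ValueError):
        return "unknown (setting not readable)"


if __name__ == "__main__":
    print("Magic SysRq:", sysrq_status())
```

If it reports "disabled", root can usually turn it on with
`echo 1 > /proc/sys/kernel/sysrq`, provided the kernel was built with
SysRq support. During a freeze, Alt+SysRq+s (sync), u (remount
read-only) and b (reboot) are then worth trying.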

> In particular, we have a USB-kvm switch that I use to switch between
> the workstation and a webstation.  I use it rarely, except when
> something seems to be wrong with my desktop...  which has been
> happening a lot lately...  
> 
> Anyway, I can't figure out the significance of the pre-crash system
> messages.  Here's a representative sample:
> 
> Jan 16 17:08:01 pc09 postgres[1590]: [8] LOG:  database system is
> ready 
> Jan 16 17:08:03 pc09 xfs: CONFIG: extra value for parameter
> "cache-balance"  
> Jan 16 17:08:04 pc09 xfs: ignoring font path element
> /usr/lib/X11/fonts/cyrillic/ (unreadable)  
> Jan 16 17:08:04 pc09 xfs: ignoring font path element
> /usr/lib/X11/fonts/CID (unreadable)  
> Jan 16 17:08:13 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8310 PROTO=2  
> Jan 16 17:08:20 pc09 kernel: 0: nvidia: loading NVIDIA Linux x86
> nvidia.o Kernel Module  1.0-4496  Wed Jul 16 19:03:09 PDT 2003 
> Jan 16 17:10:18 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8319 PROTO=2  
> Jan 16 17:12:23 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8329 PROTO=2  
> Jan 16 17:14:28 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8337 PROTO=2  
> Jan 16 17:16:33 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8348 PROTO=2  
> Jan 16 17:18:38 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8373 PROTO=2  
> Jan 16 17:20:43 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8381 PROTO=2  
> Jan 16 17:27:40 pc09 syslogd 1.4.1#13: restart. 
> Jan 16 17:27:40 pc09 kernel: klogd 1.4.1#13, log
> 
> ...
> 
> and another:
> 
> Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver usbkbd already
> loaded 
> Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver hid already loaded
> 
> Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver usbmouse already
> loaded 
> Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver keybdev already
> loaded 
> Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver mousedev already
> loaded 
> Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver usbkbd already
> loaded 
> Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver hid already loaded
> 
> Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver usbmouse already
> loaded 
> Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver keybdev already
> loaded 
> Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver mousedev already
> loaded 
> Jan 16 17:29:02 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8475 PROTO=2  
> Jan 16 17:31:07 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8497 PROTO=2  
> Jan 16 17:33:12 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8515 PROTO=2  
> Jan 16 17:35:17 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8530 PROTO=2  
> Jan 16 16:37:38 pc09 kernel: IN=eth0 OUT=
> MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
> DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8553 PROTO=2  
> Jan 16 17:45:03 pc09 syslogd 1.4.1#13: restart. 
> 
> Now I don't really know what this stuff is telling me, but they don't
> look so bad.  So I'm wondering whether the error, whatever it is,
> isn't being recorded.
>  
> Are there general guidelines as  to where to go next with this kind of
> problem?  It's f

Re: diagnosing system crash (hardware failure?)

2004-01-16 Thread Mac McCaskie
sounds very much like static electricity to me.

are you getting shocks when you touch doorknobs?

if you have a "joe's garage" pc then it may not be handling static very 
well.  You can build up plenty of static (enough anyway) while rolling 
around in your chair.

another item might be voltage level: when you are away, do you turn off 
the monitor?

HTH -mac







diagnosing system crash (hardware failure?)

2004-01-16 Thread Matt Price
Hi everyone,

my work machine has been crashing spontaneously: X freezes, sshd goes
down, and I can't use the keyboard.  This only happens
when I'm in the office, so I think it likely has something to do with
my physical presence...  

In particular, we have a USB-kvm switch that I use to switch between
the workstation and a webstation.  I use it rarely, except when
something seems to be wrong with my desktop...  which has been
happening a lot lately...  

Anyway, I can't figure out the significance of the pre-crash system
messages.  Here's a representative sample:

Jan 16 17:08:01 pc09 postgres[1590]: [8] LOG:  database system is
ready 
Jan 16 17:08:03 pc09 xfs: CONFIG: extra value for parameter
"cache-balance"  
Jan 16 17:08:04 pc09 xfs: ignoring font path element
/usr/lib/X11/fonts/cyrillic/ (unreadable)  
Jan 16 17:08:04 pc09 xfs: ignoring font path element
/usr/lib/X11/fonts/CID (unreadable)  
Jan 16 17:08:13 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8310 PROTO=2  
Jan 16 17:08:20 pc09 kernel: 0: nvidia: loading NVIDIA Linux x86
nvidia.o Kernel Module  1.0-4496  Wed Jul 16 19:03:09 PDT 2003 
Jan 16 17:10:18 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8319 PROTO=2  
Jan 16 17:12:23 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8329 PROTO=2  
Jan 16 17:14:28 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8337 PROTO=2  
Jan 16 17:16:33 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8348 PROTO=2  
Jan 16 17:18:38 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8373 PROTO=2  
Jan 16 17:20:43 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8381 PROTO=2  
Jan 16 17:27:40 pc09 syslogd 1.4.1#13: restart. 
Jan 16 17:27:40 pc09 kernel: klogd 1.4.1#13, log

...

and another:

Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver usbkbd already
loaded 
Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver hid already loaded

Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver usbmouse already
loaded 
Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver keybdev already
loaded 
Jan 16 17:28:20 pc09 usb.agent[1832]: kernel driver mousedev already
loaded 
Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver usbkbd already
loaded 
Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver hid already loaded

Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver usbmouse already
loaded 
Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver keybdev already
loaded 
Jan 16 17:28:20 pc09 usb.agent[1834]: kernel driver mousedev already
loaded 
Jan 16 17:29:02 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8475 PROTO=2  
Jan 16 17:31:07 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8497 PROTO=2  
Jan 16 17:33:12 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8515 PROTO=2  
Jan 16 17:35:17 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8530 PROTO=2  
Jan 16 16:37:38 pc09 kernel: IN=eth0 OUT=
MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3
DST=224.0.0.1 LEN=28 TOS=0x00 PREC=0x00 TTL=1 ID=8553 PROTO=2  
Jan 16 17:45:03 pc09 syslogd 1.4.1#13: restart. 

Now I don't really know what this stuff is telling me, but they don't
look so bad.  So I'm wondering whether the error, whatever it is,
isn't being recorded.
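
For what it is worth, the repeated kernel lines look like iptables LOG
entries rather than errors: PROTO=2 is IGMP and 224.0.0.1 is the
all-hosts multicast group, so they are most likely routine multicast
queries arriving about two minutes apart. A minimal Python sketch for
decoding such a line follows; it assumes the wrapped log entry has been
joined back onto one line, and the function names are illustrative only.

```python
import re

# Protocol numbers as assigned by IANA; only a few common ones are listed.
PROTOCOLS = {"1": "ICMP", "2": "IGMP", "6": "TCP", "17": "UDP"}


def parse_netfilter_line(line: str) -> dict:
    """Collect the KEY=value fields of one iptables LOG line into a dict."""
    return dict(re.findall(r"(\w+)=(\S*)", line))


def describe(line: str) -> str:
    """Summarise protocol, source, destination and interface of a LOG line."""
    fields = parse_netfilter_line(line)
    proto = PROTOCOLS.get(fields.get("PROTO", ""), "unknown protocol")
    return (f"{proto} packet from {fields.get('SRC')} "
            f"to {fields.get('DST')} on {fields.get('IN')}")


if __name__ == "__main__":
    sample = ("Jan 16 17:08:13 pc09 kernel: IN=eth0 OUT= "
              "MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 "
              "SRC=128.100.34.3 DST=224.0.0.1 LEN=28 TOS=0x00 "
              "PREC=0x00 TTL=1 ID=8310 PROTO=2")
    print(describe(sample))
    # prints: IGMP packet from 128.100.34.3 to 224.0.0.1 on eth0
```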
 
Are there general guidelines as  to where to go next with this kind of
problem?  It's fairly annoying...

thanks,
matt
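
One way to narrow down when each freeze happened is to bracket it
between the last log entry and the following "syslogd ... restart"
line. The Python sketch below does that under a few assumptions: the
year is taken from the thread date because syslog does not record it,
wrapped continuation lines are skipped, and a syslogd restart is
treated as a reboot even though it can have other causes. The function
names are hypothetical.

```python
from datetime import datetime


def parse_stamp(line, year=2004):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line, or None."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def crash_windows(lines):
    """Return (last_entry_before_restart, restart_time) pairs."""
    windows = []
    previous = None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue  # wrapped continuation line or unrelated text
        if "syslogd" in line and "restart" in line and previous is not None:
            windows.append((previous, stamp))
        previous = stamp
    return windows


if __name__ == "__main__":
    sample = [
        "Jan 16 17:18:38 pc09 kernel: IN=eth0 OUT=",
        "MAC=01:00:5e:00:00:01:00:50:99:bf:df:18:08:00 SRC=128.100.34.3",
        "Jan 16 17:20:43 pc09 kernel: IN=eth0 OUT=",
        "Jan 16 17:27:40 pc09 syslogd 1.4.1#13: restart.",
    ]
    for last_seen, restart in crash_windows(sample):
        print(f"froze between {last_seen:%H:%M:%S} and {restart:%H:%M:%S}")
```

Run against the first excerpt, this puts the freeze somewhere between
17:20:43 and 17:27:40. An out-of-order timestamp in the log will show
up as an implausible window and should be checked by hand.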

