My 2 cents: you might want to check filesystem integrity too (e.g. fsck,
xfs_check).
Amit
Alan McKinnon wrote:
On Saturday 28 November 2009 22:53:52 Harry Putnam wrote:
I keep having a problem where the OS becomes inaccessible after
running in X for a while. I haven't noticed a time pattern yet but it
doesn't take long sometimes.
Today I started from an OFF machine, booted up, started X, and did a few
things. A few minutes later I attempted to log in via ssh from a remote
laptop downstairs. The OS is inaccessible via ssh, or port 25 (it's
also a mailhub for the home LAN).
Went back to the actual machine and it is inaccessible from console as
well.
It's happened repeatedly now for a week or two, but I've been busy with
other stuff, and if I need it running I've just left it in console
mode.
The problem apparently does not occur in console mode.
I see no problem when starting X and I see nothing in
/var/log/messages that gives a clue about what is happening.
I'm running a fairly up-to-date Desktop profile on kernel:
(uname -a)
Linux reader 2.6.31-gentoo-r4_rdr-5 #6 SMP
Wed Nov 4 09:19:17 CST 2009 i686 Intel(R) Celeron(R)
CPU 3.06GHz GenuineIntel GNU/Linux
I'm not sure how to track down the problem since I'm not seeing any
giveaway clues in /var/log/messages.
So far, once the lockup has happened it appears there is no way in
other than the reboot switch.
Looks like you need more info for a diagnosis. Unfortunately this is a hit and
miss game as we don't have much clue what's going on. The lack of anything
valuable in /var/log/messages seems to indicate that either a) no syslog
messages were generated (common with client apps) or b) there is a message but
the system locks up before it can be flushed to disk.
Some ideas:
Set up an ssh session to the offending machine from a different machine that
is permanently on. Wait for the problem to occur and see if anything got
printed on the ssh console.
Set up a syslogger on a remote machine and send all your logs to it. If that
produces nothing, try having the local syslogger replicate ~/.xsession-errors
to the remote logger. I often find that remote logging manages to keep working
after the local disk has given up.
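
As a rough illustration of the ~/.xsession-errors idea, here is a minimal
Python sketch. It is a standalone stand-in for configuring the local
syslogger, not the syslogger setup itself: it follows the file and sends each
new line to a remote syslog daemon over UDP. The function names and the
"xsession:" tag are made up for this example, and it assumes the remote
daemon has been configured to accept UDP messages on port 514.

```python
import logging
import logging.handlers
import os
import time


def make_remote_logger(host, port=514):
    """Return a logger that sends each record to a syslog daemon over UDP.

    Assumption: the daemon on `host` accepts remote UDP syslog messages.
    """
    logger = logging.getLogger("xsession-forward-%s-%d" % (host, port))
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:
        handler = logging.handlers.SysLogHandler(address=(host, port))
        # "xsession:" is an arbitrary tag chosen for this sketch.
        handler.setFormatter(logging.Formatter("xsession: %(message)s"))
        logger.addHandler(handler)
    return logger


def forward_lines(lines, logger):
    """Send every non-empty line to the logger; return how many were sent."""
    sent = 0
    for line in lines:
        text = line.rstrip("\n")
        if text:
            logger.info(text)
            sent += 1
    return sent


def follow(path, poll_seconds=1.0):
    """Yield lines appended to `path`, like 'tail -f' (runs until interrupted)."""
    with open(path, "r", errors="replace") as handle:
        handle.seek(0, os.SEEK_END)  # skip what is already in the file
        while True:
            line = handle.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)
```

To use it, something like
forward_lines(follow(os.path.expanduser("~/.xsession-errors")),
make_remote_logger("loghost")) would run until interrupted, where "loghost"
is a placeholder for the remote machine. UDP is fire-and-forget, so nothing
blocks if the network is what fails first, but lines can also be lost
silently.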
Obviously, these are long-range diagnostic techniques and you have to be
patient. "emerge -e world" will take around 24 hours and may well fix your
problem, but not tell you what the cause was.