I'm in the process of building a new server: a dual Xeon 2.4GHz with 1GB of ECC/Registered memory. The CPUs have hyperthreading, so each one shows up as 2 CPUs in Linux, for a total of 4 visible CPUs.
It had been up and running well for a week whilst I moved things from other servers to it, then the other day I couldn't ssh to it. Upon further investigation (the box was still usable from the console), I found the following in /var/log/messages:

Mar 26 12:14:30 meepmeep kernel: init S F7EFDF2C 4696 1 0 10312 (NOTLB)
Mar 26 12:14:30 meepmeep kernel: Call Trace: [schedule_timeout+122/156] [process_timeout+0/96] [do_select+458/516] [sys_select+810/1132] [system_call+51/56]
Mar 26 12:14:30 meepmeep kernel: keventd S F7EEE664 5984 2 1 3 (L-TLB)
Mar 26 12:14:30 meepmeep kernel: Call Trace: [context_thread+277/464] [kernel_thread+40/56]
Mar 26 12:14:30 meepmeep kernel: ksoftirqd_CPU S F7EEA000 5880 3 1 4 2 (L-TLB)
Mar 26 12:14:30 meepmeep kernel: Call Trace: [do_softirq+111/204] [ksoftirqd+147/200] [kernel_thread+40/56]
etc.

Full log: http://www3.secret.com.au:83/users/andypoo/more/wtf.1214.txt

It seems every process on the system has faulted in some way, but I'm not familiar with this style of kernel failure. If anybody could point me in the direction of what this all means, or possible symptoms, I'd greatly appreciate it.

The box was due for colocation in a week as well; it looks like I'll have to run it in for another month.

Thanks,
Andypoo.

--
SLUG - Sydney Linux User's Group - http://slug.org.au/
More Info: http://lists.slug.org.au/listinfo/slug