Re: [CentOS] Fixing to bite the dust?

2009-05-28 Thread sam
Scott,

All the memory is first-rate stuff, DDR2 and new.  Not sure if it
*could* be a memory problem, as I've not dropped back to the 256 config
to see if the problems go away, but the whole machine crashed last night,
so I am now just going to run memtest86, then reinstall.  I never did quite
like the way I partitioned things from the get-go, plus I'll add a RAID
device to boot.

Thanks...

Sam
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Fixing to bite the dust?

2009-05-28 Thread Scott Silva
on 5-28-2009 4:49 AM sam spake the following:
> Scott,
>
> All the memory is first-rate stuff, DDR2 and new.  Not sure if it
> *could* be a memory problem, as I've not dropped back to the 256 config
> to see if the problems go away, but the whole machine crashed last night,
> so I am now just going to run memtest86, then reinstall.  I never did quite
> like the way I partitioned things from the get-go, plus I'll add a RAID
> device to boot.
>
> Thanks...
>
> Sam
Is the memory on the supported-memory list for the motherboard? Some newer
cutting-edge boards push their specs a bit to squeeze out some extra speed,
and they have a list of known-good RAM parts that they recommend. This is
most common with desktop boards that have been pressed into service as
servers, which is less than ideal, since they really weren't designed for
constant use.

Also, hopefully the system is balanced with the RAM in the proper slots, since
most DDR2 boards like matching pairs.

I'd let the memtest run as long as you can, and maybe cover a few slots on the
system if you can to get the heat up a bit, or at least leave all the covers
on if it is on a bench.





Re: [CentOS] Fixing to bite the dust?

2009-05-27 Thread Scott Silva
on 5-22-2009 5:18 PM sam spake the following:
> I dunno Nate,
>
> It started all this extraneous logging stuff after kicking memory
> from a paltry 256 MB up to 2 GB. The system has on occasion crashed with
> little else in the log files that would indicate any other kind of
> hardware problem, so all the rejected packets or partial entries
> (none of which showed up in this particular snippet) lead me to believe
> it's dropping sync somehow.  I'll watch the further logs and stuff to
> see if I can find something more definitive.
>
> Thanks..
>
> Sam
Is the memory compatible? Quality memory or swapmeet crap?
Adequate power supply, or also swapmeet junk?





Re: [CentOS] Fixing to bite the dust?

2009-05-24 Thread Dag Wieers

On Fri, 22 May 2009, Lanny Marcus wrote:
> On Fri, May 22, 2009 at 4:35 PM, sam s...@wa4phy.net wrote:
>> I've been getting LOTS of messages like the below in the daily log, and
>> from all indications, it appears to all be related to the CPU;
>> the machine is just over a year old, and was the old vortex.wa4phy.net
>> server from the downtown co-lo site.  Aside from huge log files, and
>> lots of other fluff, numerous problems of another nature have started
>> cropping up.  Anyone have any suggestions as to what to do besides make
>> a boat anchor out of it?  It's all Greek to me, so I'm totally at the
>> mercy of those folks who understand the bits and bites (lit) that have
>> run amok in things.  Any suggestions as to a possible fault other than the
>> CPU just becoming more toasty brown? :)
>
> To me, it looks like the messages all have to do with communications.
> What are the other problems that you say have started cropping up?
> Are those things in the logs or the box rebooting by itself or some
> other issue?


Setting up diskdump or (preferably) netconsole from netdump may help you
find the root cause. We retrofitted all 400 systems with netconsole
at one customer after we were stuck finding out why one server died on us.
If the crash is not reproducible, you have lost your (maybe only)
opportunity to have it fixed.


So in the end, setting up netconsole for a company's datacenter is
something you had better do from the start, because the moment
you really need it, it's already too late :)


--
--   dag wieers,  d...@centos.org,  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]


Re: [CentOS] Fixing to bite the dust?

2009-05-23 Thread William L. Maltby

On Fri, 2009-05-22 at 20:18 -0400, sam wrote:
> I dunno Nate,
>
> It started all this extraneous logging stuff after kicking memory
> from a paltry 256 MB up to 2 GB. The system has on occasion crashed with
> little else in the log files that would indicate any other kind of
> hardware problem, so all the rejected packets or partial entries
> (none of which showed up in this particular snippet) lead me to believe
> it's dropping sync somehow.  I'll watch the further logs and stuff to
> see if I can find something more definitive.

Along with the memtest86 mentioned in another reply, consider this.

When I upgraded the memory on my CentOS 4.x box, an Acer Ak77-400
(Max/N) unit with the VIA KT400 chipset, to 2GB last year, it wouldn't
work. To make a long story short, the DIMMs had no specs, and the memory
voltage in the BIOS was set to auto. My fix was to change that to manual
and bump the voltage a couple of tenths of a volt. More memory draws more
current and may need more voltage to drive it.

*MAKE SURE YOU DON'T VOID YOUR WARRANTY*

*If* this tweak fries your memory (one or two tenths should not), you
might be SOL. You might want to check the specs on the memory, check
with your vendor about the issue (including warranty), etc. before
proceeding.

 
> Thanks..
>
> Sam
> <snip sig stuff>

HTH
-- 
Bill



Re: [CentOS] Fixing to bite the dust?

2009-05-22 Thread nate
sam wrote:
> I've been getting LOTS of messages like the below in the daily log, and
> from all indications, it appears to all be related to the CPU;
> the machine is just over a year old, and was the old vortex.wa4phy.net
> server from the downtown co-lo site.  Aside from huge log files, and
> lots of other fluff, numerous problems of another nature have started
> cropping up.  Anyone have any suggestions as to what to do besides make
> a boat anchor out of it?  It's all Greek to me, so I'm totally at the
> mercy of those folks who understand the bits and bites (lit) that have
> run amok in things.  Any suggestions as to a possible fault other than the
> CPU just becoming more toasty brown? :)

Those messages don't look particularly bad. Is the system
crashing or something?

It seems like some sort of packet logging, perhaps iptables log
rules or something?

I get messages like this in my logs all the time, no harm done:
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=125.89.73.173
DST=209.90.228.140 LEN=40 TOS=0x00 PREC=0x00 TTL=104 ID=256 PROTO=TCP
SPT=6000 DPT=3306 WINDOW=16384 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=74.160.133.83
DST=209.90.228.140 LEN=48 TOS=0x00 PREC=0x00 TTL=112 ID=14491 DF PROTO=TCP
SPT=4664 DPT=139 WINDOW=64512 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=74.160.133.83
DST=209.90.228.140 LEN=48 TOS=0x00 PREC=0x00 TTL=112 ID=15210 DF PROTO=TCP
SPT=4664 DPT=139 WINDOW=64512 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=95.67.143.212
DST=209.90.228.140 LEN=48 TOS=0x00 PREC=0x00 TTL=109 ID=6209 DF PROTO=TCP
SPT=4604 DPT=139 WINDOW=65535 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=95.67.143.212
DST=209.90.228.140 LEN=48 TOS=0x00 PREC=0x00 TTL=109 ID=6884 DF PROTO=TCP
SPT=4604 DPT=139 WINDOW=65535 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00
SRC=207.183.171.254 DST=209.90.228.140 LEN=52 TOS=0x00 PREC=0x00 TTL=51
ID=1415 DF PROTO=TCP SPT=14669 DPT=139 WINDOW=60352 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00
SRC=207.183.171.254 DST=209.90.228.140 LEN=52 TOS=0x00 PREC=0x00 TTL=51
ID=2157 DF PROTO=TCP SPT=14669 DPT=139 WINDOW=60352 RES=0x00 SYN URGP=0
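
If you want to check what entries like these are actually reporting, a rough Python sketch such as the one below can tally them by destination port and source address. It assumes each entry has been joined back onto a single line (the mail client wrapped the ones above), and the function names are made up for illustration rather than taken from any existing tool.

```python
import re
from collections import Counter

# KEY=VALUE pairs such as SRC=1.2.3.4 or DPT=139. Bare flags (SYN, DF) are
# skipped, and an empty value (OUT=) is kept as an empty string.
FIELD_RE = re.compile(r"\b([A-Z]+)=(\S*)")


def parse_log_entry(entry):
    """Return a dict of the KEY=VALUE fields found in one packet-log entry."""
    return dict(FIELD_RE.findall(entry))


def summarize(entries):
    """Count entries per (protocol, destination port) and per source address."""
    by_port = Counter()
    by_source = Counter()
    for entry in entries:
        fields = parse_log_entry(entry)
        if "SRC" not in fields:
            continue  # not a packet-log entry
        by_port[(fields.get("PROTO"), fields.get("DPT"))] += 1
        by_source[fields["SRC"]] += 1
    return by_port, by_source


if __name__ == "__main__":
    # One of the entries above, joined onto a single line.
    sample = [
        "IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 "
        "SRC=125.89.73.173 DST=209.90.228.140 LEN=40 TOS=0x00 PREC=0x00 "
        "TTL=104 ID=256 PROTO=TCP SPT=6000 DPT=3306 WINDOW=16384 RES=0x00 "
        "SYN URGP=0",
    ]
    by_port, by_source = summarize(sample)
    for (proto, port), count in by_port.most_common():
        print(proto, port, count)
```

Run over the entries above, a tally like this shows inbound TCP SYNs to ports 139 and 3306 from a handful of outside addresses, which fits unsolicited connection attempts being logged rather than a hardware fault.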

nate




Re: [CentOS] Fixing to bite the dust?

2009-05-22 Thread Lanny Marcus
On Fri, May 22, 2009 at 4:35 PM, sam s...@wa4phy.net wrote:
> I've been getting LOTS of messages like the below in the daily log, and
> from all indications, it appears to all be related to the CPU;
> the machine is just over a year old, and was the old vortex.wa4phy.net
> server from the downtown co-lo site.  Aside from huge log files, and
> lots of other fluff, numerous problems of another nature have started
> cropping up.  Anyone have any suggestions as to what to do besides make
> a boat anchor out of it?  It's all Greek to me, so I'm totally at the
> mercy of those folks who understand the bits and bites (lit) that have
> run amok in things.  Any suggestions as to a possible fault other than the
> CPU just becoming more toasty brown? :)

To me, it looks like the messages all have to do with communications.
What are the other problems that you say have started cropping up?
Are those things in the logs or the box rebooting by itself or some
other issue?


Re: [CentOS] Fixing to bite the dust?

2009-05-22 Thread sam
I dunno Nate,

It started all this extraneous logging stuff after kicking memory
from a paltry 256 MB up to 2 GB. The system has on occasion crashed with
little else in the log files that would indicate any other kind of
hardware problem, so all the rejected packets or partial entries
(none of which showed up in this particular snippet) lead me to believe
it's dropping sync somehow.  I'll watch the further logs and stuff to
see if I can find something more definitive.

Thanks..

Sam




Re: [CentOS] Fixing to bite the dust?

2009-05-22 Thread nate
sam wrote:
> I dunno Nate,
>
> It started all this extraneous logging stuff after kicking memory
> from a paltry 256 MB up to 2 GB. The system has on occasion crashed with
> little else in the log files that would indicate any other kind of
> hardware problem, so all the rejected packets or partial entries
> (none of which showed up in this particular snippet) lead me to believe
> it's dropping sync somehow.  I'll watch the further logs and stuff to
> see if I can find something more definitive.

System crashes often do not log anything. If possible, configure
the system to put its console on the serial port, then connect
that port to another system or a terminal server.

Or you can run something like memtest86, though it can sometimes
take a while (days) to trace memory errors. There is a burn-in test
suite that a lot of vendors use, created by VA Linux a long time
ago, called Cerberus (ctcs); you can find it on SourceForge. In
my experience, if the hardware is bad (assuming motherboard/CPU/RAM),
ctcs will crash the system in a matter of hours every time.
It also helps catch bad disk controllers and disks, since it
puts the system under incredible strain.

nate

