Re: [CentOS] Fixing to bite the dust?
Scott,

All the memory is first-rate stuff, DDR2 and new. I'm not sure whether it *could* be a memory problem, as I've not dropped back to the 256 MB config to see if the errors go away, but the whole machine crashed last night, so I'm now just going to run memtest86 and then reinstall. I never did quite like the way I partitioned things from the get-go, plus I'll add a RAID device to boot.

Thanks...
Sam
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
Re: [CentOS] Fixing to bite the dust?
on 5-28-2009 4:49 AM sam spake the following:
> Scott,
>
> All the memory is first-rate stuff, DDR2 and new. I'm not sure whether it *could* be a memory problem, as I've not dropped back to the 256 MB config to see if the errors go away, but the whole machine crashed last night, so I'm now just going to run memtest86 and then reinstall. I never did quite like the way I partitioned things from the get-go, plus I'll add a RAID device to boot.
>
> Thanks...
> Sam

Is the memory on the approved list for the motherboard? Some newer cutting-edge boards pushed their specs a bit to squeeze out extra speed, and they come with a list of known-good RAM parts that the manufacturer recommends. This is most common with desktop boards that have been pressed into service as servers, which is less than ideal, since they really weren't designed for constant use.

Also, hopefully the RAM is in the proper slots, since most DDR2 boards like matching pairs. I'd let memtest run as long as you can, and maybe cover a few slots on the system to get the heat up a bit, or at least leave all the covers on if it is on a bench.
Re: [CentOS] Fixing to bite the dust?
on 5-22-2009 5:18 PM sam spake the following:
> I dunno, Nate. All this extraneous logging started after I bumped the memory from a paltry 256 MB up to 2 GB. The system has on occasion crashed with little else in the log files that would indicate any other kind of hardware problem, so all the rejected packets and partial entries (none of which showed up in this particular snippet) lead me to believe it's dropping sync somehow. I'll watch further logs to see if I can find something more definitive.
>
> Thanks..
> Sam

Is the memory compatible? Quality memory or swap-meet crap? Adequate power supply, or also swap-meet junk?
Re: [CentOS] Fixing to bite the dust?
On Fri, 22 May 2009, Lanny Marcus wrote:
> On Fri, May 22, 2009 at 4:35 PM, sam s...@wa4phy.net wrote:
>> I've been getting LOTS of messages like the below in the daily log, and from all indications they all appear to be related to the CPU. The machine is just over a year old, and was the old vortex.wa4phy.net server from the downtown co-lo site. Aside from huge log files and lots of other fluff, numerous problems of another nature have started cropping up. Anyone have any suggestions as to what to do besides make a boat anchor out of it? It's all Greek to me, so I'm totally at the mercy of those folks who understand the bits and bites (lit) that have run amok in things. Any suggestions for a possible fault other than the CPU just becoming more toasty brown? :)
>
> To me, it looks like the messages all have to do with communications. What are the other problems that you say have started cropping up? Are those things in the logs, or the box rebooting by itself, or some other issue?

Setting up diskdump or (preferably) netconsole from netdump may help you find the root cause. We retrofitted all 400 systems at one customer with netconsole after we got stuck trying to find out why one server died on us.

If the crash is not reproducible, you have lost your (maybe only) opportunity to get it fixed. So in the end, setting up netconsole for a company's datacenter is something you had better do from the start, because the moment you really need it, it's already too late :)

--
-- dag wieers, d...@centos.org, http://dag.wieers.com/ --
[Any errors in spelling, tact or fact are transmission errors]
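Since netconsole ships kernel messages over UDP to another machine, the receiving end can be very small. What follows is only a minimal sketch of such a receiver, not the netdump server or a syslog replacement. The function names, the port number 6666, and the log file name are illustrative assumptions; the port has to match whatever the sending machine is configured to use.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket for incoming console messages.

    The port is an assumption; use the one the sender is configured with.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect(sock, logfile, max_messages=None):
    """Append each received datagram to logfile, prefixed with the sender IP.

    Runs forever when max_messages is None; returns the number of
    datagrams written otherwise.
    """
    received = 0
    while max_messages is None or received < max_messages:
        data, (sender, _port) = sock.recvfrom(65535)
        text = data.decode("utf-8", errors="replace").rstrip("\n")
        logfile.write(f"{sender} {text}\n")
        # Flush at once: the last lines before a crash are the ones that matter.
        logfile.flush()
        received += 1
    return received


def main():
    """Hypothetical entry point: log everything to netconsole.log until killed."""
    with open("netconsole.log", "a") as log:
        collect(open_listener(), log)
```

Calling `main()` on a second machine leaves a file that survives the crash of the first one, which is the whole point of logging off-box. In practice an existing syslog daemon on the receiving host would normally do this job.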
Re: [CentOS] Fixing to bite the dust?
On Fri, 2009-05-22 at 20:18 -0400, sam wrote:
> I dunno, Nate. All this extraneous logging started after I bumped the memory from a paltry 256 MB up to 2 GB. The system has on occasion crashed with little else in the log files that would indicate any other kind of hardware problem, so all the rejected packets and partial entries (none of which showed up in this particular snippet) lead me to believe it's dropping sync somehow. I'll watch further logs to see if I can find something more definitive.

Along with the memtest86 mentioned in another reply, consider this. When I upgraded the memory on my CentOS 4.x box, an Acer Ak77-400 (Max/N) unit with the VIA KT400 chipset, to 2 GB last year, it wouldn't work. To make a long story short, the DIMMs had no specs, and the voltage setting in the BIOS was on auto. My fix was to change that to manual and bump the voltage a couple of tenths of a volt. More memory draws more current and may need more voltage to drive that current.

*MAKE SURE YOU DON'T VOID YOUR WARRANTY*

*If* this tweak fries your memory (one or two tenths should not), you might be SOL. You might want to check the specs on the memory, check with your vendor about the issue (including warranty), etc. before proceeding.

> Thanks..
> Sam

snip sig stuff

HTH
--
Bill
Re: [CentOS] Fixing to bite the dust?
sam wrote:
> I've been getting LOTS of messages like the below in the daily log, and from all indications they all appear to be related to the CPU. The machine is just over a year old, and was the old vortex.wa4phy.net server from the downtown co-lo site. Aside from huge log files and lots of other fluff, numerous problems of another nature have started cropping up. Anyone have any suggestions as to what to do besides make a boat anchor out of it? It's all Greek to me, so I'm totally at the mercy of those folks who understand the bits and bites (lit) that have run amok in things. Any suggestions for a possible fault other than the CPU just becoming more toasty brown? :)

Those messages don't look particularly bad. Is the system crashing or something? It looks like some sort of packet logging, perhaps iptables LOG rules. I get messages like this in my logs all the time, no harm done:

IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=125.89.73.173 DST=209.90.228.140 LEN=40 TOS=0x00 PREC=0x00 TTL=104 ID=256 PROTO=TCP SPT=6000 DPT=3306 WINDOW=16384 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=74.160.133.83 DST=209.90.228.140 LEN=48 TOS=0x00 PREC=0x00 TTL=112 ID=14491 DF PROTO=TCP SPT=4664 DPT=139 WINDOW=64512 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=74.160.133.83 DST=209.90.228.140 LEN=48 TOS=0x00 PREC=0x00 TTL=112 ID=15210 DF PROTO=TCP SPT=4664 DPT=139 WINDOW=64512 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=95.67.143.212 DST=209.90.228.140 LEN=48 TOS=0x00 PREC=0x00 TTL=109 ID=6209 DF PROTO=TCP SPT=4604 DPT=139 WINDOW=65535 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=95.67.143.212 DST=209.90.228.140 LEN=48 TOS=0x00 PREC=0x00 TTL=109 ID=6884 DF PROTO=TCP SPT=4604 DPT=139 WINDOW=65535 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=207.183.171.254 DST=209.90.228.140 LEN=52 TOS=0x00 PREC=0x00 TTL=51 ID=1415 DF PROTO=TCP SPT=14669 DPT=139 WINDOW=60352 RES=0x00 SYN URGP=0
IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 SRC=207.183.171.254 DST=209.90.228.140 LEN=52 TOS=0x00 PREC=0x00 TTL=51 ID=2157 DF PROTO=TCP SPT=14669 DPT=139 WINDOW=60352 RES=0x00 SYN URGP=0

nate
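Entries like the ones nate posted are just space-separated KEY=value tokens plus a few bare flags (SYN, DF), so a few lines of Python are enough to see which ports are being probed. This is a rough sketch only: the function names are made up for illustration, the two sample lines are copied from the post above, and a real syslog line would also carry a timestamp and host prefix, whose words this naive splitter would lump in with the flags.

```python
from collections import Counter


def parse_log_entry(line):
    """Split one iptables-style LOG entry into a dict of KEY=value fields.

    Tokens without '=' (SYN, DF, ...) are collected under 'FLAGS'.
    """
    fields = {"FLAGS": []}
    for token in line.split():
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
        else:
            fields["FLAGS"].append(token)
    return fields


def count_destination_ports(lines):
    """Count how many logged packets targeted each destination port (DPT)."""
    counts = Counter()
    for line in lines:
        entry = parse_log_entry(line)
        if "DPT" in entry:
            counts[entry["DPT"]] += 1
    return counts


# Two entries taken from the post above.
sample = [
    "IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 "
    "SRC=125.89.73.173 DST=209.90.228.140 LEN=40 TOS=0x00 PREC=0x00 TTL=104 "
    "ID=256 PROTO=TCP SPT=6000 DPT=3306 WINDOW=16384 RES=0x00 SYN URGP=0",
    "IN=eth0 OUT= MAC=00:0c:29:68:2f:4a:00:0b:bf:73:84:1b:08:00 "
    "SRC=74.160.133.83 DST=209.90.228.140 LEN=48 TOS=0x00 PREC=0x00 TTL=112 "
    "ID=14491 DF PROTO=TCP SPT=4664 DPT=139 WINDOW=64512 RES=0x00 SYN URGP=0",
]

for port, hits in count_destination_ports(sample).most_common():
    print(port, hits)
```

Run against a whole log file, a tally like this tends to show the same handful of ports (139, 3306 and friends) being hit from unrelated addresses, which is ordinary Internet background scanning rather than a sign of failing hardware.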
Re: [CentOS] Fixing to bite the dust?
On Fri, May 22, 2009 at 4:35 PM, sam s...@wa4phy.net wrote:
> I've been getting LOTS of messages like the below in the daily log, and from all indications they all appear to be related to the CPU. The machine is just over a year old, and was the old vortex.wa4phy.net server from the downtown co-lo site. Aside from huge log files and lots of other fluff, numerous problems of another nature have started cropping up. Anyone have any suggestions as to what to do besides make a boat anchor out of it? It's all Greek to me, so I'm totally at the mercy of those folks who understand the bits and bites (lit) that have run amok in things. Any suggestions for a possible fault other than the CPU just becoming more toasty brown? :)

To me, it looks like the messages all have to do with communications. What are the other problems that you say have started cropping up? Are those things in the logs, or the box rebooting by itself, or some other issue?
Re: [CentOS] Fixing to bite the dust?
I dunno, Nate. All this extraneous logging started after I bumped the memory from a paltry 256 MB up to 2 GB. The system has on occasion crashed with little else in the log files that would indicate any other kind of hardware problem, so all the rejected packets and partial entries (none of which showed up in this particular snippet) lead me to believe it's dropping sync somehow. I'll watch further logs to see if I can find something more definitive.

Thanks..
Sam
Re: [CentOS] Fixing to bite the dust?
sam wrote:
> I dunno, Nate. All this extraneous logging started after I bumped the memory from a paltry 256 MB up to 2 GB. The system has on occasion crashed with little else in the log files that would indicate any other kind of hardware problem, so all the rejected packets and partial entries (none of which showed up in this particular snippet) lead me to believe it's dropping sync somehow. I'll watch further logs to see if I can find something more definitive.

System crashes often do not log anything. If possible, hook up a serial console: configure the system to put its console on the serial port, then connect that to another system or a terminal server.

Or you can run something like memtest86, though it can sometimes take a while (days) to trace memory errors. There is also a burn-in test suite that a lot of vendors use, created by VA Linux a long time ago, called Cerberus (ctcs); you can find it on SourceForge. In my experience, if the hardware is bad (assuming motherboard/CPU/RAM), ctcs will crash the system in a matter of hours every time. It puts the system under incredible strain, and it helps catch bad disk controllers and disks as well.

nate