Hello, Matt.

>>The vast majority of panics are hardware-related. It is rare nowadays
>>for a usermode program to make the system panic. In particular you said
>>the problem happens more under load. That really points even more to a
>>hardware problem - bad CPU cache ram, bad ram, scsi termination, that
>>sort of thing.
>>
>>Ted
>>
> This is kind of going to be a blanket post to all the recent suggestions
> to me. I appreciate suggestions :) Ted, sorry, my other posts had
> dmesg and hardware specs, etc. I just couldn't remember the subject line
> of that thread. I'll be more descriptive here.
>
> We have two different servers crashing. Both are SMP, but on different
> hardware. We have five FreeBSD servers in total, and only two are
> affected. That is why I do not believe this is a hardware problem.
> In any case, the machines are in a cold room where the temperature is
> constantly maintained. 20 other servers in there are perfectly stable,
> with no problems.
>
> The machine that crashed last night while running portsdb -uU is a
> Super Micro machine with hyperthreading disabled in the BIOS, dual
> 3.06 GHz CPUs, and 4 GB of memory. We ran a memory test on orion (the
> machine that crashed last night) a week or so ago, and it found 70,000
> ECC errors. Those were fixed and that machine had been stable until
> last night. I've now disabled SMP support; we'll see if that keeps it
> stable or not. portsdb -uU ran without problems after I disabled SMP.
>
> As for uranus, the other box (we keep a planet naming scheme for a
> certain set of servers), we ran memtest86 and found no errors at all.
> That box crashed about two days ago but has been stable since. It has
> not lasted more than a week without hitting a kernel trap and freezing.
>
> It seems that both these servers have this problem. Out of the five
> FreeBSD servers we have, these two are the ones with the highest load.
> Maybe a higher load on the other three servers would cause the same
> problem. I would agree with you that this is a hardware problem, but
> seeing it on more than one server, with two different architectures
> and our highest load, makes me reconsider.
>
> If this is truly a bug in FreeBSD 5.4-RELEASE, maybe this is something
> that has been fixed in -stable? I will compile a debug kernel today and
> try to provide a trace of the problem.
> I'll do it on whichever server crashes next.

I had the same situation with two different heavily loaded servers (both
SMP, with 8 GB of RAM and HT enabled) running 5.4-RELEASE. After
disabling HT and updating the OS to 5.4-STABLE with cvsup, both have
been working fine without any problems; the last reboot was 28 days ago.

> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to
> "[EMAIL PROTECTED]"

-- 
Best regards,
Sergey S. Ropchan