I've been seeing a smattering of these in the error logs on an x4500:

Sep 11 15:12:08.7007 ereport.cpu.amd.nb.mem_ce
Sep 12 03:27:19.1011 ereport.cpu.amd.nb.mem_ce
Sep 12 09:34:49.3013 ereport.cpu.amd.nb.mem_ce
Sep 12 15:42:19.4911 ereport.cpu.amd.nb.mem_ce
Sep 12 21:49:59.6950 ereport.cpu.amd.nb.mem_ce
Sep 13 03:57:29.8841 ereport.cpu.amd.nb.mem_ce
Sep 13 10:05:00.0976 ereport.cpu.amd.nb.mem_ce
Sep 13 16:12:40.2817 ereport.cpu.amd.nb.mem_ce
Sep 13 22:20:10.4972 ereport.cpu.amd.nb.mem_ce
Sep 14 10:35:20.8700 ereport.cpu.amd.nb.mem_ce
Sep 14 22:50:21.2817 ereport.cpu.amd.nb.mem_ce
Sep 15 04:58:01.4661 ereport.cpu.amd.nb.mem_ce
Sep 15 11:05:31.6787 ereport.cpu.amd.nb.mem_ce
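
To put a number on how regular these are, here is a minimal Python sketch that computes the gap between consecutive ereports. The timestamps are copied from the listing above; the names STAMPS, GAPS and MULTIPLES are just for illustration, and since the log lines carry no year, the parser's default year is used (only the differences matter).

```python
from datetime import datetime

# Timestamps copied from the ereport listing above. The log omits the year,
# so strptime falls back to its default (1900); only differences are used.
STAMPS = [
    "Sep 11 15:12:08.7007", "Sep 12 03:27:19.1011", "Sep 12 09:34:49.3013",
    "Sep 12 15:42:19.4911", "Sep 12 21:49:59.6950", "Sep 13 03:57:29.8841",
    "Sep 13 10:05:00.0976", "Sep 13 16:12:40.2817", "Sep 13 22:20:10.4972",
    "Sep 14 10:35:20.8700", "Sep 14 22:50:21.2817", "Sep 15 04:58:01.4661",
    "Sep 15 11:05:31.6787",
]

def gaps(stamps):
    """Return the timedelta between each pair of consecutive timestamps."""
    times = [datetime.strptime(s, "%b %d %H:%M:%S.%f") for s in stamps]
    return [later - earlier for earlier, later in zip(times, times[1:])]

GAPS = gaps(STAMPS)
SHORTEST = min(GAPS)

# Express every gap as a multiple of the shortest one.
MULTIPLES = [round(g / SHORTEST) for g in GAPS]

if __name__ == "__main__":
    for gap, mult in zip(GAPS, MULTIPLES):
        print(f"{gap}  (~{mult} x shortest gap)")
```

Every gap comes out at roughly 6 hours 7.5 minutes, or twice that, so the events look periodic rather than random. That by itself says nothing about the cause or about whether a fault will eventually be diagnosed.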

Nothing's been reported as faulted, but they are occurring on a pretty regular basis. I was digging around trying to find out what these mean, but the only thing I really found was:

http://opensolaris.org/os/project/generic-mca/docs/portfolio/diagnosis/

I think they're correctable ECC errors? The link above indicates they are diagnosed by "amd64.esc", but I haven't been able to find any details on how many correctable errors have to occur before something is considered faulted.

Elsewhere, the same page says "The number of such page_sb faults is counted for each chip-select, and when any chip-select has more than 64 pages faulted in this way we fault the chip-select with a fault.memory.generic-x86.dimm_sb". I read that as meaning more than 64 pages on a single chip-select have to be faulted for correctable errors before a fault is generated against the DIMM. Is that correct? Is there a better source of documentation on this?

Thanks...

-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
_______________________________________________
fm-discuss mailing list
fm-discuss@opensolaris.org