I believe you will find that fmd has disabled this DIMM or these DIMMs.
If you know how much memory is supposed to be there, you can check the
output of "prtconf" or, preferably, "prtdiag -v" and compare it with
what you believe is physically installed.

As I recall from the last time I read the fmadm manpage, end users are
expected NOT to run fmadm repair unless specifically instructed to do so
by Sun XXX Oracle support staff. It *may* *not* need to be run at all
when this DIMM is replaced (in my experience, the various Sun SPARC
hardware platforms each behave a bit differently with respect to fault
management).

Certainly, you do NOT run 'fmadm repair' before actually replacing the DIMMs identified as failing.

Unless you suffer additional error rate problems with other DIMMs, I would not expect another panic-induced reboot.
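
For the memory comparison mentioned above, here is a rough sketch of how the check could be scripted. It relies on prtconf printing a "Memory size: N Megabytes" line. The helper names mem_mb and check_mem are made up for illustration, the 32768 figure is only a placeholder for whatever is physically installed, and I am assuming a POSIX-style shell such as ksh93 or bash rather than the legacy /bin/sh.

```shell
# Sketch only: mem_mb and check_mem are hypothetical helper names.

# Pull the megabyte figure out of a "Memory size: N Megabytes" line,
# which is how prtconf reports total memory. Reads from stdin.
mem_mb() {
    awk '/^Memory size:/ { print $3; exit }'
}

# Compare a reported size with the size you expect (both in megabytes).
check_mem() {
    reported=$1
    expected=$2
    if [ "$reported" -lt "$expected" ]; then
        echo "short by $((expected - reported)) MB"
    else
        echo "all expected memory present"
    fi
}

# On the affected host (32768 is a placeholder, not your real figure):
#   check_mem "$(prtconf | mem_mb)" 32768
```

"prtdiag -v" is still the better source if you want to see which banks the system reports, since it lists them individually; the script only looks at the total.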

On 03/11/11 14:17, Paul Robertson wrote:
Our V890 server reported a memory fault, rebooted, and now shows the
following:

csgams08:~>sudo fmadm faulty
Password:
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Mar 11 00:21:46 7acef7a1-6c9e-49db-9b03-ff6f0d5f911d  SUN4U-8000-35  Critical

Fault class : fault.memory.bank 95%
Affects     : mem:///unum=Slot,B:J8100,J8101,J8201,J8200
              degraded but still in service
FRU         : mem:///unum=Slot,B:J8100,J8101,J8201,J8200 95%
Serial ID.  :

Description : The number of errors associated with this memory module
has exceeded acceptable levels.  Refer to
http://sun.com/msg/SUN4U-8000-35 for more information.

Response    : Pages of memory associated with this memory module are
being removed from service as errors are reported.

Impact      : Total system memory capacity will be reduced as pages
are retired.

Action      : Schedule a repair procedure to replace the affected
memory module. Use fmdump -v -u<EVENT_ID>  to identify the module.

We've scheduled the replacement already, but I want to understand
whether fmd has effectively disabled these dimms until such time as
we run "fmadm repair". In other words, is it likely that we'll get
another failure/reboot before we can schedule the maintenance? If so,
I guess we'll try and asr-disable these dimms to minimize the risk.

Please advise.

Paul

--
Jerry Sutton    jer...@airmail.net