On occasion I have systems that reboot spontaneously. I can often find entries
like the one below in fault management, but they are not particularly helpful.
I suspect there is really nothing wrong and that the software is generating a
panic and rebooting. Is there a way to mask this so that it does not trigger
any action, or to figure out what the source of the issue is?

In this particular case, I watched the system dump 96 GB of RAM onto a
dedicated dump device. However, I was unable to retrieve the data afterwards
and received a message from savecore that read something like 'save core: bad
magic number b'.

Any insights would be appreciated.

Thanks,
J.

root@db017:~# fmadm faulty
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Aug 25 20:08:47 6c3020a1-e7bf-69e3-ab37-cb68d4324a0e SUNOS-8000-J0  Major
Host        : db017
Platform    : S5520UR   Chassis_id  : ............
Product_sn  :
Fault class : defect.sunos.eft.unexpected_telemetry 50%
              fault.sunos.eft.unexpected_telemetry 50%
Problem in  : dev:////pci@0,0
                  faulted and taken out of service
Description : The diagnosis engine encountered telemetry from the listed
              devices for which it was unable to perform a diagnosis -
              Refer to http://sun.com/msg/SUNOS-8000-J0 for more information.
              Refer to http://sun.com/msg/SUNOS-8000-J0 for more information.
Response    : Error reports have been logged for examination by Sun.
Impact      : Automated diagnosis and response for these events will not occur.
Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing
              (PSH) patches are installed.
_______________________________________________
OpenIndiana-discuss mailing list
[email protected]
http://openindiana.org/mailman/listinfo/openindiana-discuss