Rahul Dhesi writes:
 > SunOS deals with soft memory errors in a very nice way.  After a certain
 > number of soft memory errors have occurred, it syslog's a message saying
 > essentially:
 > 
 >    XXX corrected memory errors on memory chip YYY
 > 
 > where XXX is how many times an error was corrected and YYY is
 > the location of the chip on the motherboard.
 > 
 > I think this is a very nice strategy.  It avoids too many false warnings
 > but still alerts the operator to consider replacing an unreliable memory
 > chip.  The same strategy could be used in any situation where
 > correctable errors are occurring:  Simply keep count of them and log a
 > warning when a threshold is reached.  

But it is important to have the block number for each retry, because
the conclusions are different if the errors always hit the same block
than if they are spread all over the disk.

You can't just keep a count of errors; you have to keep specific data
about each error. That makes things more difficult for the driver,
because there are many more disk blocks than memory chips.
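
To illustrate why a bare count is not enough, here is a rough Python
sketch (not driver code) that keeps the block number of every retry and
reports which blocks keep failing. The function name
summarize_retries, the threshold of 3, and the block numbers are all
made up for the example.

```python
from collections import Counter

def summarize_retries(blocks, threshold=3):
    """Summarize a list of block numbers, one entry per retried read.

    'threshold' is an arbitrary, illustrative cutoff: a block that was
    retried at least this many times is reported as a repeat offender.
    """
    counts = Counter(blocks)
    repeat_offenders = sorted(b for b, n in counts.items() if n >= threshold)
    return {
        "total": len(blocks),              # what a bare counter would give you
        "distinct_blocks": len(counts),    # how widely the errors are spread
        "repeat_offenders": repeat_offenders,
    }

# Same total number of retries, very different situations:
same = summarize_retries([1042, 1042, 1042, 1042])
scattered = summarize_retries([17, 9001, 420, 77123])
print(same)
print(scattered)
```

Both calls see four retries, so a plain counter could not tell them
apart; only the per-block data shows one bad block in the first case
and errors spread across the disk in the second.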

Logging to a disk file is probably a lot easier. You can then use your 
preferred extraction and report language to do stats.
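
As a sketch of that kind of after-the-fact reporting, assume the
driver logs one line per retry in the form "<device> <block>". That
format, the device names, and the function name retries_per_block are
invented for this example; a real log would need its own parsing.

```python
from collections import Counter, defaultdict

# Stand-in for the contents of the log file (hypothetical format).
SAMPLE_LOG = """\
da0 1042
da0 1042
da1 77
da0 1042
da0 5000
"""

def retries_per_block(lines):
    """Count retries per (device, block) from '<device> <block>' lines."""
    per_device = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        # Skip anything that does not match the assumed two-field format.
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        device, block = fields
        per_device[device][int(block)] += 1
    return per_device

stats = retries_per_block(SAMPLE_LOG.splitlines())
for device in sorted(stats):
    # most_common() lists the worst blocks first.
    for block, n in stats[device].most_common():
        print(f"{device} block {block}: {n} retries")
```

With a real file, passing the open file object instead of
SAMPLE_LOG.splitlines() would work the same way, since the function
only iterates over lines.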



