Hi all,

I have a Dell R710 server running FreeBSD 8.0-RELEASE/amd64, with a PERC 6 RAID controller and four SAS drives in a RAID10 configuration. The RAID controller does a weekly "patrol read", and the most recent run threw up a load of errors:
+========================================================================
+seqNum: 0x00000b5e
+Time: Tue Aug 3 22:06:15 2010
+
+Code: 0x00000071
+Class: 0
+Locale: 0x02
+Event Description: Unexpected sense: PD 02(e0x20/s2) Path 5000c5000561dfc9, CDB: 2f 00 19 21 40 00 00 10 00 00, Sense: 3/11/00
+Event Data:
+ Device ID: 2
+ Enclosure Index: 32
+ Slot Number: 2
+ CDB Length: 10
+ CDB Data:
+ 002f 0000 0019 0021 0040 0000 0000 0010 0000 0000 0000 0000 0000 0000 0000 0000 Sense Length: 18
+ Sense Data:
+ 00f0 0000 0003 0019 0021 004b 00e1 000a 0000 0000 0000 0000 0011 0000 0081 0080 0000 0097 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
+
+========================================================================
+seqNum: 0x00000b5f
+Time: Tue Aug 3 22:06:15 2010
+
+Code: 0x0000005d
+Class: 0
+Locale: 0x02
+Event Description: Patrol Read corrected medium error on PD 02(e0x20/s2) at 19214be1
+Event Data:
+ Device ID: 2
+ Enclosure Index: 32
+ Slot Number: 2
+ LBA: 421612513
+
+========================================================================

...and a lot more of the same.

Everything else on the machine is still working fine as far as I can tell, and the status of the RAID volume is still reported as "Optimal":

Checking status of MFI RAID controllers:
Adapter: 0
------------------------------------------------------------------------
Physical Drive Information:
ENC SLO DEV SEQ MEC OEC PFC LPF STATE
32  0   0   2   0   0   0   0   Online
32  1   1   2   0   0   0   0   Online
32  2   2   2   234 0   0   0   Online
32  3   3   2   0   0   0   0   Online

Virtual Drive Information:
VD DRV RLP RLS RLQ STS  SIZE      STATE   NAME
0  2   1   3   0   64kB 1143552MB Optimal SVN

BBU Information:
TYPE TEMP OK  RSOC ASOC RC      CC ME
BBU  20C  Yes 92%  77%  1377mAh 7  2%

I'm not sure how I should interpret these errors and what action, if any, I should take.
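As a starting point for interpreting the first event, the raw CDB and sense bytes can be decoded by hand. The following is only a minimal sketch, assuming the standard SCSI layouts (a 10-byte VERIFY(10) CDB with opcode 0x2f, and fixed-format sense data with response code 0x70/0x71). The function names are made up for illustration, and the byte strings are copied from the log above with the 16-bit padding removed.

```python
# Sketch only: decodes the CDB and sense bytes from the first logged event.
# Function names are hypothetical; field offsets assume the standard SCSI
# VERIFY(10) CDB and fixed-format sense data layouts.

SENSE_KEYS = {0x3: "MEDIUM ERROR"}                      # only the key seen here
ASC_ASCQ = {(0x11, 0x00): "UNRECOVERED READ ERROR"}     # only the code seen here


def decode_verify10_cdb(cdb: bytes) -> dict:
    """Return the starting LBA and block count of a VERIFY(10) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x2F:
        raise ValueError("not a VERIFY(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),      # bytes 2-5: starting LBA
        "blocks": int.from_bytes(cdb[7:9], "big"),   # bytes 7-8: verification length
    }


def decode_fixed_sense(sense: bytes) -> dict:
    """Return the main fields of fixed-format sense data."""
    if sense[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return {
        "valid": bool(sense[0] & 0x80),                     # information field valid
        "sense_key": sense[2] & 0x0F,
        "information": int.from_bytes(sense[3:7], "big"),   # failing LBA when valid
        "asc": sense[12],
        "ascq": sense[13],
    }


# Bytes taken from "CDB Data" and the first 18 bytes of "Sense Data" in the log.
CDB = bytes.fromhex("2f 00 19 21 40 00 00 10 00 00")
SENSE = bytes.fromhex("f0 00 03 19 21 4b e1 0a 00 00 00 00 11 00 81 80 00 97")

cdb_info = decode_verify10_cdb(CDB)
sense_info = decode_fixed_sense(SENSE)

# Does the failing LBA fall inside the range the VERIFY command covered?
in_range = cdb_info["lba"] <= sense_info["information"] < cdb_info["lba"] + cdb_info["blocks"]

print("VERIFY(10) from LBA", cdb_info["lba"], "for", cdb_info["blocks"], "blocks")
print("Sense key:", SENSE_KEYS.get(sense_info["sense_key"], "unknown"))
print("ASC/ASCQ:", ASC_ASCQ.get((sense_info["asc"], sense_info["ascq"]), "unknown"))
print("Failing LBA:", sense_info["information"], "within verified range:", in_range)
```

Read this way, "Sense: 3/11/00" is a medium error (unrecovered read error) hit during a verify pass, and the failing LBA in the sense data, 0x19214be1 = 421612513, is the same block that the second event reports as corrected by the patrol read.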
Do I need to replace - or at least rebuild - the offending drive? Can I do that safely without taking the machine down?

The box is covered by Dell support but I'd like to get all my facts straight before I call them and they try to pin a hardware problem on this "FreeBSD" thing they've never heard of before...

Many thanks,
Scott