Hi,

I recently noticed that iostat is reporting a lot of Hard Errors on multiple 
drives. dmesg also shows repeated messages from the mpt driver.

My config is:
MB: SUPERMICRO X8SIL-F
HBA: AOC-USAS-L8i (LSI 1068)
RAM: 4GB ECC
SunOS SAN 5.11 snv_134 i86pc i386 i86pc Solaris

The pool is a stripe of mirrored vdevs, 13 drives in total. One mirror had an 
error on a drive, which I cleared; just to be safe, I also added another drive 
to that mirror:

        NAME         STATE     READ WRITE CKSUM
        zpool        ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            c4t13d0  ONLINE       0     0     0
            c4t19d0  ONLINE       0     0     0
          mirror-1   ONLINE       0     0     0
            c4t25d0  ONLINE       0     0     0
            c4t31d0  ONLINE       0     0     0
          mirror-2   ONLINE       0     0     0
            c4t12d0  ONLINE       0     0     0
            c4t18d0  ONLINE       0     0     0
          mirror-3   ONLINE       0     0     0
            c4t24d0  ONLINE       0     0     0
            c4t30d0  ONLINE       0     0     0
          mirror-4   ONLINE       0     0     0
            c4t11d0  ONLINE       0     0     0
            c4t17d0  ONLINE       0     0     0
            c4t10d0  ONLINE       0     0     0
          mirror-5   ONLINE       0     0     0
            c4t23d0  ONLINE       0     0     0
            c4t29d0  ONLINE       0     0     0


Here's the output from iostat -En:

c6d1             Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: WDC WD3200BEKT- Revision:  Serial No:      WD-WXR1A30 Size: 320.07GB <320070352896 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
c7d1             Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: WDC WD3200BEKT- Revision:  Serial No:      WD-WXR1A30 Size: 320.07GB <320070352896 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
c4t12d0          Soft Errors: 0 Hard Errors: 252 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0003 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t13d0          Soft Errors: 0 Hard Errors: 252 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0002 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t18d0          Soft Errors: 0 Hard Errors: 252 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0003 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t19d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0002 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t24d0          Soft Errors: 0 Hard Errors: 252 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0003 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t25d0          Soft Errors: 0 Hard Errors: 252 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0002 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t30d0          Soft Errors: 0 Hard Errors: 252 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0003 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t31d0          Soft Errors: 0 Hard Errors: 252 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0002 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t17d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: WDC WD20EADS-32S Revision: 0A01 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t11d0          Soft Errors: 0 Hard Errors: 17 Transport Errors: 116
Vendor: ATA      Product: WDC WD20EADS-32S Revision: 5G04 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 17 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t23d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: ST31500341AS     Revision: CC1H Serial No:
Size: 1500.30GB <1500301910016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t29d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: ST31500341AS     Revision: CC1H Serial No:
Size: 1500.30GB <1500301910016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t10d0          Soft Errors: 0 Hard Errors: 252 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD204UI  Revision: 0001 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
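
For anyone who wants to tabulate these counters instead of reading them by eye, here is a minimal Python sketch that pulls the three error counters out of each device's header line. The function name `summarize` and the shortened `sample` text are only for illustration; the regular expression assumes the header layout shown above.

```python
import re

# Header line of each `iostat -En` record: device name, then three counters.
HEADER = re.compile(
    r"^(?P<dev>\S+)\s+Soft Errors: (?P<soft>\d+) "
    r"Hard Errors: (?P<hard>\d+) Transport Errors: (?P<transport>\d+)"
)


def summarize(text):
    """Return {device: (soft, hard, transport)} parsed from `iostat -En` text."""
    summary = {}
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            summary[m["dev"]] = (
                int(m["soft"]),
                int(m["hard"]),
                int(m["transport"]),
            )
    return summary


# A few lines copied from the output above (shortened for illustration).
sample = """\
c4t12d0          Soft Errors: 0 Hard Errors: 252 Transport Errors: 0
Vendor: ATA      Product: SAMSUNG HD203WI  Revision: 0003 Serial No:
c4t19d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
c4t11d0          Soft Errors: 0 Hard Errors: 17 Transport Errors: 116
"""

# Print only the devices that show hard or transport errors.
for dev, (soft, hard, transport) in summarize(sample).items():
    if hard or transport:
        print(dev, "hard:", hard, "transport:", transport)
```

Run against the full output, this makes it easy to see which devices share the same Hard Errors count and which one (c4t11d0) also has Transport Errors.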

And a sample from dmesg:

Jan  1 10:26:28 SAN     Log info 0x31123000 received for target 11.
Jan  1 10:26:28 SAN     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Jan  1 10:26:28 SAN scsi: [ID 365881 kern.info] /p...@0,0/pci8086,d...@3/pci15d9,a...@0 (mpt0):
Jan  1 10:26:28 SAN     Log info 0x31123000 received for target 11.
Jan  1 10:26:28 SAN     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Jan  1 10:26:28 SAN scsi: [ID 365881 kern.info] /p...@0,0/pci8086,d...@3/pci15d9,a...@0 (mpt0):
Jan  1 10:26:28 SAN     Log info 0x31123000 received for target 11.
Jan  1 10:26:28 SAN     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
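
As a starting point for reading these values, here is a small Python sketch that splits the two hex fields into their parts. The constant names and the bit layout are my understanding of LSI's MPI headers (mpi.h and mpi_log_sas.h) and should be treated as assumptions, not something the driver output confirms. The sketch only breaks the numbers apart; it is not a diagnosis.

```python
# Assumed values, as I understand them from LSI's MPI headers (mpi.h).
IOCSTATUS_FLAG_LOG_INFO_AVAILABLE = 0x8000  # set when a "Log info" word accompanies the status
IOCSTATUS_MASK = 0x7FFF                     # remaining bits hold the status code itself
IOCSTATUS_SCSI_IOC_TERMINATED = 0x004B      # status code: I/O terminated by the IOC


def split_ioc_status(ioc_status):
    """Return (log_info_available, status_code) for an mpt ioc_status value."""
    has_log_info = bool(ioc_status & IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
    return has_log_info, ioc_status & IOCSTATUS_MASK


def split_log_info(log_info):
    """Split a 32-bit log info word into its assumed fields (mpi_log_sas.h layout)."""
    return {
        "bus_type": (log_info >> 28) & 0xF,    # assumed: 3 means SAS
        "originator": (log_info >> 24) & 0xF,  # assumed: 0 = IOP, 1 = PL, 2 = IR
        "code": (log_info >> 16) & 0xFF,
        "subcode": log_info & 0xFFFF,
    }


# Values taken from the dmesg sample above.
has_log_info, status = split_ioc_status(0x804B)
print(has_log_info, hex(status), status == IOCSTATUS_SCSI_IOC_TERMINATED)

fields = split_log_info(0x31123000)
print({name: hex(value) for name, value in fields.items()})
```

If the layout assumption holds, ioc_status=0x804b is the "IOC terminated" status with the log-info flag set, and 0x31123000 is a SAS log word with originator 1, code 0x12, and subcode 0x3000.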


What do these messages mean? Surely most of my SAMSUNG drives can't all be 
failing at once? Almost all of them show exactly the same number of Hard Errors 
(252), which is weird. Could this be caused by the fact that these SAMSUNG 
drives have 4K sectors? 'zpool status' currently reports no errors, although it 
did report a checksum error on one drive a while back, which I cleared.

Any help greatly appreciated!
Thanks
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
