A couple of days after updating to oi_151a5, the first of my two boot drives stopped being able to do I/O and zfs removed it from the pool. I only noticed because someone posted on the list that the first of his two boot drives had been removed from the pool not long after he updated to oi_151a5. I ran 'zpool status rpool' and found my own pool in the same condition as his. The same person later posted that after downgrading to oi_151a4 the OS could see the drive and do I/O with it.

This evening I replaced the failed drive with a completely different one. The OS is able to query the drive info but is still completely unable to perform I/O on it.

# zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 0h6m with 0 errors on Tue Jul 10 20:45:50 2012
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c3t0d0s0  FAULTED      0     0     0  too many errors
            c3t1d0s0  ONLINE       0     0     0

errors: No known data errors

# dd if=/dev/rdsk/c3t0d0s0 of=/dev/null bs=64k count=1024
dd: opening `/dev/rdsk/c3t0d0s0': I/O error

# format
AVAILABLE DISK SELECTIONS:
       0. c3t0d0 <ATA-ST1000NM0011-SN02 cyl 60798 alt 2 hd 255 sec 126>
          /pci@0,0/pci15d9,62c@1f,2/disk@0,0
       1. c3t1d0 <ATA-WDCWD5003ABYX-0-1S02 cyl 60798 alt 2 hd 255 sec 63>
          /pci@0,0/pci15d9,62c@1f,2/disk@1,0

Iostat does not show any errors logged against my new drive:

# iostat -xe
extended device statistics ---- errors ---
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b s/w h/w trn tot
sd1       0.2    0.0    3.6    0.0  0.0  0.0    0.1   0   0   0   0   0   0
sd2       3.9    2.4  122.5   23.2  0.0  0.0    6.4   1   1   0   0   0   0

# iostat -E
sd1       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: ST1000NM0011     Revision: SN02 Serial No: Z1N21SQN
Size: 1000.20GB <1000204886016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 108 Predictive Failure Analysis: 0
sd2       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: WDC WD5003ABYX-0 Revision: 1S02 Serial No: WD-WMAYP3661514
Size: 500.11GB <500107862016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 33 Predictive Failure Analysis: 0
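
To compare the counters for both drives at a glance, something like the following sketch could be used. It assumes the 'iostat -E' line layout shown above, and summarize_iostat_errors is only a name made up for illustration, not part of any OS tool.

```shell
# Summarize per-device error counters from `iostat -E` text on stdin.
# Assumption: each device starts with a "<dev> Soft Errors: ..." line and
# ends with an "Illegal Request: ..." line, as in the output above.
# summarize_iostat_errors is a hypothetical helper name.
summarize_iostat_errors() {
    awk '
        # Device header, e.g. "sd1  Soft Errors: 0 Hard Errors: 0 Transport Errors: 0"
        /Soft Errors:/ {
            dev = $1
            soft = $4
            hard = $7
            trn = $10
        }
        # Last line of each device block carries the Illegal Request count
        /Illegal Request:/ {
            printf "%s soft=%s hard=%s transport=%s illegal_request=%s\n", dev, soft, hard, trn, $3
        }
    '
}
```

It would be run as 'iostat -E | summarize_iostat_errors' and prints one line per device, so the Illegal Request counts (108 for sd1 and 33 for sd2 above) are easy to compare.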

# cfgadm -f -c configure sata0/0::dsk/c3t0d0
cfgadm: Library error: Cannot determine sata port number for ap_id: /devices/pci@0,0/pci15d9,62c@1f,2:0::dsk/c3t0d0

The cfgadm error above seems really strange; it sounds as if the OS has become confused about the device.

Is there a known kernel configuration or driver issue that might cause the OS to forget how to do I/O with SATA drives, particularly the first boot drive?

Bob
--
Bob Friesenhahn
[email protected], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

