Hi everyone,

I've had some time to upgrade the machine in question to nv-b77 and run
the same tests, and I'm happy to report that hotspares now work a lot
better. The only question remaining for us: how long until these changes
are integrated into a supported Solaris release?

See below for some logs.

# zpool history data
History for 'data':
2007-11-22.14:48:18 zpool create -f data raidz2 c4t0d0 c4t1d0 c4t2d0
c4t3d0 c4t4d0 c4t5d0 c4t6d0 c4t8d0 c4t9d0 c4t10d0 spare c4t11d0 c4t12d0

From /var/adm/messages:
Nov 22 15:15:52 ddd scsi: [ID 107833 kern.warning] WARNING:
/[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd16):
Error for Command: write(10)               Error Level: Fatal
Requested Block: 103870006                 Error Block: 103870006
Vendor: transtec                           Serial Number:
Sense Key: Not_Ready
ASC: 0x4 (LUN not ready intervention required), ASCQ: 0x3, FRU: 0x0
(and about 27 more of these, until 15:16:02)

Nov 22 15:16:12 ddd scsi: [ID 107833 kern.warning] WARNING:
/[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd16): offline or
reservation conflict
(95 of these, until 15:43:49, almost half an hour later)

And then the console showed: "The device has been offlined and marked as
faulted. An attempt will be made to activate a hotspare if available."

And my current zpool status shows:
# zpool status
  pool: data
 state: DEGRADED
status: One or more devices are faulted in response to persistent
        errors. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the
        device repaired.
 scrub: resilver completed with 0 errors on Thu Nov 22 16:09:49 2007
config:

        NAME           STATE     READ WRITE CKSUM
        data           DEGRADED     0     0     0
          raidz2       DEGRADED     0     0     0
            c4t0d0     ONLINE       0     0     0
            c4t1d0     ONLINE       0     0     0
            spare      DEGRADED     0     0     0
              c4t2d0   FAULTED      0 23.7K     0  too many errors
              c4t11d0  ONLINE       0     0     0
            c4t3d0     ONLINE       0     0     0
            c4t4d0     ONLINE       0     0     0
            c4t5d0     ONLINE       0     0     0
            c4t6d0     ONLINE       0     0     0
            c4t8d0     ONLINE       0     0     0
            c4t9d0     ONLINE       0     0     0
            c4t10d0    ONLINE       0     0     0
        spares
          c4t11d0      INUSE     currently in use
          c4t12d0      AVAIL

One remark: I find the overview above a bit confusing (the 'spare' entry
is apparently 'DEGRADED' and consists of both c4t2d0 and c4t11d0), but
the hotspare was properly activated this time and my pool is otherwise
in good health.
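For completeness, here is a sketch of the usual way to get the spare back to AVAIL from this state (device names taken from the status output above; this is not from my session, and which path applies depends on whether c4t2d0 is genuinely bad):

```shell
# If c4t2d0 really failed: swap in a new disk and resilver onto it.
# Once the replacement completes, ZFS should release the spare.
zpool replace data c4t2d0

# If the errors were transient (e.g. the LUN was briefly not ready)
# and the disk is actually healthy: mark it repaired, then detach
# the spare to return it to the pool's spare list.
zpool clear data c4t2d0
zpool detach data c4t11d0

# Verify: c4t11d0 should show as AVAIL again under 'spares'.
zpool status data
```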

Thanks everyone for the replies and suggestions,

Regards, Paul Boven.
-- 
Paul Boven <[EMAIL PROTECTED]> +31 (0)521-596547
Unix/Linux/Networking specialist
Joint Institute for VLBI in Europe - www.jive.nl
VLBI - It's a fringe science
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
