Message was edited by: cemery
This message posted from opensolaris.org
zpool status my_pool shows a pulled disk c1t6d0 as ONLINE???
New server build with Solaris 10 5/08 on a SunFire T5220,
and this is our first rollout of ZFS and zpools.
We have 8 disks; the boot disk is hardware mirrored (c1t0d0 + c1t1d0).
Created a raidz1 (RAID-5-like) zpool called my_pool using 5 disks plus 1 hot spare:
c1t2d0, c1t3d0, c1t4d0, c1t5d0, and c1t6d0, with c1t7d0 as the spare.
I am working on alerting and recovery plans for disk failures in the zpool.
As a test, I pulled disk c1t6d0 to see what a disk failure would look like.
"zpool status -v my_pool" still reports disk c1t6d0 as ONLINE.
"iostat -En" also does not yet realize that the disk has been pulled.
By contrast, format sees that the disk is missing,
and the disk pull did generate errors in /var/adm/messages.
Do I need to probe the device bus with some command to get a more accurate status?
I would appreciate any recommendations for zpool disk failure monitoring.
Below is the output from "zpool status -v", "iostat -En", "format", and the
tail of /var/adm/messages:
============================================
newserver:/# zpool status -v
  pool: my_pool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        my_pool     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
            c1t6d0  ONLINE       0     0     0
        spares
          c1t7d0    AVAIL
errors: No known data errors
=============================================
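A note on the ONLINE state above: on this vintage of ZFS the pool generally only notices a missing disk when it next tries to do I/O to it, so forcing I/O (for example with "zpool scrub my_pool") will usually get the state updated. For alerting, one option is to parse the "state:" field out of captured status output; a minimal sketch, assuming standard awk (the sample text here just reuses the status header shown above, where a cron job would capture live "zpool status my_pool" output instead):

```shell
# Sketch: extract the "state:" field from "zpool status" output.
# In a real cron job you would capture live output with:
#   STATUS=`/usr/sbin/zpool status my_pool`
# Here we reuse the status header from above as sample input.
STATUS='  pool: my_pool
 state: ONLINE
 scrub: none requested'

echo "$STATUS" | awk '$1 == "state:" { print $2 }'   # prints ONLINE
```

Anything other than ONLINE (DEGRADED, FAULTED, UNAVAIL) in that field is worth an alert.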
newserver:/# iostat -En
c1t0d0 Soft Errors: 2 Hard Errors: 0 Transport Errors: 0
Vendor: LSILOGIC Product: Logical Volume Revision: 3000 Serial No:
Size: 146.56GB <146561286144 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
c1t2d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST914602SSUN146G Revision: 0603 Serial No: 0811953XZG
Size: 146.81GB <146810536448 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c0t0d0 Soft Errors: 4 Hard Errors: 3 Transport Errors: 0
Vendor: TSSTcorp Product: CD/DVDW TS-T632A Revision: SR03 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 3 No Device: 0 Recoverable: 0
Illegal Request: 4 Predictive Failure Analysis: 0
c1t3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST914602SSUN146G Revision: 0603 Serial No: 08139591NN
Size: 146.81GB <146810536448 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t4d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST914602SSUN146G Revision: 0603 Serial No: 0813957V3R
Size: 146.81GB <146810536448 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t5d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST914602SSUN146G Revision: 0603 Serial No: 0813957V2J
Size: 146.81GB <146810536448 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t6d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST914602SSUN146G Revision: 0603 Serial No: 0812951XFK
Size: 146.81GB <146810536448 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t7d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE Product: ST914602SSUN146G Revision: 0603 Serial No: 0813957V4B
Size: 146.81GB <146810536448 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
=============================================
newserver:/# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
...
5. c1t6d0 <drive not available>
/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL
PROTECTED]/[EMAIL PROTECTED],0
...
Specify disk (enter its number): 5
AVAILABLE DRIVE TYPES:
0. Auto configure
1. Quantum ProDrive 80S
2. Quantum ProDrive 105S
...
20. other
Specify disk type (enter its number): ^C
=============================================
newserver:/# tail -20 /var/adm/messages
Jul 9 14:49:26 sbknwsxapd3 scsi: [ID 107833 kern.notice] Device is gone
Jul 9 14:49:26 sbknwsxapd3 last message repeated 1 time
Jul 9 14:49:26 sbknwsxapd3 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL
PROTECTED],0 (sd6):
Jul 9 14:49:26 sbknwsxapd3 offline or reservation conflict
Jul 9 14:51:01 sbknwsxapd3 scsi: [ID 107833 kern.notice] Device is gone
Jul 9 14:51:01 sbknwsxapd3 last message repeated 1 time
Jul 9 14:51:01 sbknwsxapd3 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL
PROTECTED],0 (sd6):
Jul 9 14:51:01 sbknwsxapd3 offline or reservation conflict
Jul 9 14:51:01 sbknwsxapd3 scsi: [ID 107833 kern.notice] Device is gone
Jul 9 14:51:01 sbknwsxapd3 last message repeated 1 time
Jul 9 14:51:01 sbknwsxapd3 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL
PROTECTED],0 (sd6):
Jul 9 14:51:01 sbknwsxapd3 offline or reservation conflict
Jul 9 14:51:12 sbknwsxapd3 scsi: [ID 107833 kern.notice] Device is gone
Jul 9 14:51:12 sbknwsxapd3 last message repeated 2 times
Jul 9 14:51:12 sbknwsxapd3 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL
PROTECTED],0 (sd6):
Jul 9 14:51:12 sbknwsxapd3 offline or reservation conflict
Jul 9 14:51:14 sbknwsxapd3 scsi: [ID 107833 kern.notice] Device is gone
Jul 9 14:51:14 sbknwsxapd3 last message repeated 1 time
Jul 9 14:51:14 sbknwsxapd3 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL
PROTECTED],0 (sd6):
Jul 9 14:51:14 sbknwsxapd3 offline or reservation conflict
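On the monitoring question: besides watching /var/adm/messages, "zpool status -x" prints exactly "all pools are healthy" when no pool has a problem, and the Solaris fault manager records diagnosed faults (viewable with "fmadm faulty" and "fmdump"). A minimal cron-able watchdog sketch along those lines (untested against a live failure; the mail recipient and command paths are assumptions):

```shell
#!/bin/sh
# Watchdog sketch built on the fact that "zpool status -x" prints
# exactly "all pools are healthy" when no pool has a problem.
ADMIN_MAIL="root"          # assumption: where alerts should go

check_pools() {
    # $1: captured output of "zpool status -x"
    if [ "$1" = "all pools are healthy" ]; then
        echo OK
    else
        echo ALERT
    fi
}

# A real cron job would do something like (assumed paths):
#   STATUS=`/usr/sbin/zpool status -x`
#   if [ "`check_pools \"$STATUS\"`" = ALERT ]; then
#       echo "$STATUS" | mailx -s "zpool problem on `hostname`" "$ADMIN_MAIL"
#   fi

check_pools "all pools are healthy"     # -> OK
check_pools "  pool: my_pool
 state: DEGRADED"                        # -> ALERT
```

Pairing this with "fmadm faulty" in the same job catches faults that FMA has diagnosed even before the pool state changes.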
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss