Re: [zfs-discuss] ZFS - how to determine which physical drive to replace
On Sat, Dec 12, 2009 at 9:58 AM, Edward Ned Harvey wrote:
> I would suggest something like this: While the system is still on, if the
> failed drive is at least writable *a little bit* ... then you can "dd
> if=/dev/zero of=/dev/rdsk/FailedDiskDevice bs=1024 count=1024" ... and then
> after the system is off, you could plug the drives into another system
> one-by-one, and read the first 1M, and see if it's all zeros. (Or instead
> of dd zero, you could echo some text onto the drive, or whatever you think
> is easiest.)

How about reading instead?

    dd if=/dev/rdsk/$whatever of=/dev/null

If the failed disk generates I/O errors that prevent it from reading at a rate that causes an LED to blink, you could read from all of the good disks instead. The one that doesn't blink is the broken one.

You can also get the drive serial number with iostat -En:

    $ iostat -En c3d0
    c3d0  Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
    Model: Hitachi HTS5425 Revision:  Serial No: 080804BB6300HCG
    Size: 160.04GB <160039305216 bytes>
    Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
    Illegal Request: 0
    ...

That /should/ be printed on the disk somewhere.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
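The serial-number lookup above can be scripted. A minimal sketch, assuming the `Serial No:` field layout shown in the sample output; the text below is copied from the post, and on a live system you would pipe `iostat -En <device>` straight into the same pattern:

```shell
# Extract the serial number from saved 'iostat -En' output. The sample text
# is the output quoted above; the 'Serial No:' pattern is an assumption
# about iostat's field layout on this system.
iostat_out='Model: Hitachi HTS5425 Revision:  Serial No: 080804BB6300HCG Size: 160.04GB'
serial=$(printf '%s\n' "$iostat_out" | sed -n 's/.*Serial No: *\([^ ]*\).*/\1/p')
printf 'serial=%s\n' "$serial"
```

With the serial in hand, you can match the device name against the label printed on the physical drive.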
Re: [zfs-discuss] ZFS - how to determine which physical drive to replace
I've found that when I build a system, it's worth the initial effort to install drives one by one to see how they get mapped to names. Then I put labels on the drives and SATA cables. If there were room to label the actual SATA ports on the motherboard and cards, I would. While this isn't foolproof, it gives me a bit more reassurance in the [inevitable] event of a drive failure.

On Sat, Dec 12, 2009 at 9:17 AM, Paul Bruce wrote:
> Hi,
> I'm just about to build a ZFS system as a home file server in raidz, but I
> have one question - pre-empting the need to replace one of the drives if it
> ever fails.
> How on earth do you determine the actual physical drive that has failed?
> I've got the whole zpool status thing worked out, but how do I translate
> the c1t0d0, c1t0d1 etc. to a real physical drive.
> I can just see myself looking at the 6 drives, and thinking "...
> c1t0d1, I think that's *this* one"... eenie meenie minie moe
> P
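The build-time mapping described above can also be kept as a small flat table instead of (or alongside) physical labels. A sketch under invented data; the device names and serials here are examples, and on a real system you would fill the table from `iostat -En` output per device:

```shell
# Sketch: keep the device-name -> serial mapping recorded at build time in
# a small table, so a failed c#t#d# name can be matched to the label on the
# physical drive. Names and serials below are invented examples.
map='c1t0d0 SERIAL-AAA
c1t1d0 SERIAL-BBB
c1t2d0 SERIAL-CCC'

lookup() {
    # Print the serial recorded for the given device name.
    printf '%s\n' "$map" | awk -v dev="$1" '$1 == dev { print $2 }'
}

lookup c1t1d0
```

When zpool status reports a failed device, one lookup tells you which sticker to hunt for in the case.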
Re: [zfs-discuss] ZFS - how to determine which physical drive to replace
On Sat, Dec 12, 2009 at 8:17 AM, Paul Bruce wrote:
> Hi,
> I'm just about to build a ZFS system as a home file server in raidz, but I
> have one question - pre-empting the need to replace one of the drives if it
> ever fails.
> How on earth do you determine the actual physical drive that has failed?
> I've got the whole zpool status thing worked out, but how do I translate
> the c1t0d0, c1t0d1 etc. to a real physical drive.
> I can just see myself looking at the 6 drives, and thinking "...
> c1t0d1, I think that's *this* one"... eenie meenie minie moe
> P

As suggested at http://opensolaris.org/jive/thread.jspa?messageID=416264, you can try viewing the disk serial numbers with cfgadm:

    cfgadm -al -s "select=type(disk),cols=ap_id:info"

You may need to power down the system to view the serial numbers printed on the disks to match them up, but it beats guessing.

Ed Plese
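The cfgadm listing above can be turned into device/serial pairs with standard text tools. The sample line below is hypothetical (the exact `info` column text varies by HBA and driver), so treat the parsing pattern as an assumption to adapt to your own output:

```shell
# Parse a (hypothetical) line of
#   cfgadm -al -s "select=type(disk),cols=ap_id:info"
# output into "device serial" pairs. Real info columns differ per driver,
# so the 'SN:' marker here is an assumption.
cfgadm_out='sata1/0::dsk/c1t0d0  Mod: HITACHI HTS5425 SN: 080804BB6300HCG'
pair=$(printf '%s\n' "$cfgadm_out" | sed -n 's|.*dsk/\([^ ]*\).*SN: *\([^ ]*\).*|\1 \2|p')
printf '%s\n' "$pair"
```

Run over the full listing, this gives you the whole map in one pass, without powering anything down.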
Re: [zfs-discuss] ZFS - how to determine which physical drive to replace
This is especially important, because if you have 1 failed drive and you pull the wrong drive, now you have 2 failed drives. And that could destroy the dataset (depending on whether you have raidz-1 or raidz-2).

Whenever possible, always get the hotswappable hardware that will blink a red light for you, so there can be no mistake. Even if the hardware doesn't blink a light for you, you could manually cycle between activity and non-activity on the disks, to identify the disk yourself.

But if that's not a possibility ... if you have no lights on non-hotswappable disks ... then: Given you're going to have to power off the system, and given it's difficult to map the device name to a physical wire, I would suggest something like this.

While the system is still on, if the failed drive is at least writable *a little bit* ... then you can

    dd if=/dev/zero of=/dev/rdsk/FailedDiskDevice bs=1024 count=1024

... and then after the system is off, you could plug the drives into another system one-by-one, read the first 1M, and see if it's all zeros. (Or instead of dd zero, you could echo some text onto the drive, or whatever you think is easiest.)

Obviously that's not necessarily an option. If the drive is completely dead, totally unwritable, then when you plug the drives one-by-one into another system, it should be easy to identify the failed drive.

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Paul Bruce
Sent: Saturday, December 12, 2009 9:18 AM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] ZFS - how to determine which physical drive to replace

Hi,
I'm just about to build a ZFS system as a home file server in raidz, but I have one question - pre-empting the need to replace one of the drives if it ever fails.
How on earth do you determine the actual physical drive that has failed?
I've got the whole zpool status thing worked out, but how do I translate the c1t0d0, c1t0d1 etc. to a real physical drive.
I can just see myself looking at the 6 drives, and thinking "... c1t0d1, I think that's *this* one"... eenie meenie minie moe
P
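The zero-marking trick above can be rehearsed safely with an ordinary file standing in for the failed disk; substitute the real `/dev/rdsk/...` device on a live system, and be absolutely certain of the target before zeroing anything:

```shell
# Rehearse the zero-marking trick on a scratch file instead of a real disk.
disk=$(mktemp)

# Step 1 (on the live system): zero the first 1 MiB of the failed drive.
dd if=/dev/zero of="$disk" bs=1024 count=1024 2>/dev/null

# Step 2 (drive moved to another machine): is the first 1 MiB all zeros?
# Reading the region and deleting NUL bytes leaves nothing iff it was zeroed.
nonzero=$(dd if="$disk" bs=1024 count=1024 2>/dev/null | tr -d '\0' | wc -c)
if [ "$nonzero" -eq 0 ]; then
    marked=yes    # all zeros: this is the drive that was marked
else
    marked=no     # real data present: a healthy drive
fi
printf 'marked=%s\n' "$marked"
rm -f "$disk"
```

The check avoids device-specific tools entirely, so it works the same on whatever second system the drives are plugged into.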
[zfs-discuss] ZFS - how to determine which physical drive to replace
Hi,
I'm just about to build a ZFS system as a home file server in raidz, but I have one question - pre-empting the need to replace one of the drives if it ever fails.
How on earth do you determine the actual physical drive that has failed?
I've got the whole zpool status thing worked out, but how do I translate the c1t0d0, c1t0d1 etc. to a real physical drive.
I can just see myself looking at the 6 drives, and thinking "... c1t0d1, I think that's *this* one"... eenie meenie minie moe
P