Re: [zfs-discuss] OpenSolaris 2008.11 - resilver still restarting

2009-07-14 Thread Ross
Rather bizarrely, after that second failure I pulled the disk, cleared the pool, re-inserted it and forced it online. This time, ZFS resilvered fine with zero errors: # zpool status pool: rc-pool state: ONLINE status: The pool is formatted using an older on-disk format. The pool can

[zfs-discuss] OpenSolaris 2008.11 - resilver still restarting

2009-07-13 Thread Ross
Just look at this. I thought all the restarting resilver bugs were fixed, but it looks like something odd is still happening at the start: Status immediately after starting resilver: # zpool status pool: rc-pool state: DEGRADED status: One or more devices has experienced an unrecoverable

Re: [zfs-discuss] OpenSolaris 2008.11 - resilver still restarting

2009-07-13 Thread Galen
Ross, I feel you here, but I don't have much of a solution. The best I can suggest (and what has been my own solution) is to take out the problematic disk, copy it to a fresh disk (preferably using something like dd_rescue) and then re-install. It seems the resilvering loop is generally a result

Re: [zfs-discuss] OpenSolaris 2008.11 - resilver still restarting

2009-07-13 Thread Ross Walker
Maybe it's the disk's firmware that is bad, or maybe they're jumpered for 1.5Gbps on a 3.0Gbps-only bus? Or maybe it's a problem with the disk cable/bay/enclosure/slot? It sounds like there is more than ZFS in the mix here. I wonder if the drive's status keeps flapping online/offline and

Re: [zfs-discuss] OpenSolaris 2008.11 - resilver still restarting

2009-07-13 Thread Ross
No, I don't think I need to take a disk out. It's running ok now, it just seemed to get a bit confused at the start: $ zpool status pool: rc-pool state: DEGRADED status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error.

Re: [zfs-discuss] OpenSolaris 2008.11 - resilver still restarting

2009-07-13 Thread Ross
Gaaah, looks like I spoke too soon: $ zpool status pool: rc-pool state: DEGRADED status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. action: Determine if the device needs to be replaced, and clear

Re: [zfs-discuss] OpenSolaris 2008.11 - resilver still restarting

2009-07-13 Thread Galen
Ross, The disks do have problems - that's why I'm resilvering. I've seen zero read, write or checksum errors and had it loop. Now I do have a number of read errors on some of the disks, but I think resilvering is missing the point if it can't deal with corrupt data or disks with a small

Re: [zfs-discuss] OpenSolaris 2008.11 - resilver still restarting

2009-07-13 Thread Ross Walker
On Jul 13, 2009, at 11:33 AM, Ross no-re...@opensolaris.org wrote: Gaaah, looks like I spoke too soon: $ zpool status pool: rc-pool state: DEGRADED status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are