Looks like I've hit this bug:
http://bugs.opensolaris.org/view_bug.do?bug_id=6782540  However, none of the
workarounds listed in that bug, or in any of the related bugs, works.  :(

Going through the zfs-discuss and freebsd-fs archives, I see that others
have run into this issue and managed to solve it via "zpool detach" /
"zpool online" / "zpool replace" (sketched below, after the zpool output).
However, looking closely at the archived messages, all of the successful
cases had one thing in common:  one drive ONLINE, one drive FAULTED.  If a
drive is online, it can obviously be detached.  In every case where people
have been unable to fix the situation, one drive is OFFLINE and one drive
is FAULTED.  As is our case:

[fc...@thehive  ~]$ zpool status -v
  pool: storage
 state: DEGRADED
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME                    STATE     READ WRITE CKSUM
        storage                 DEGRADED     0     0     0
          raidz2                DEGRADED     0     0     0
            label/disk01        ONLINE       0     0     0
            label/disk02        ONLINE       0     0     0
            label/disk03        ONLINE       0     0     0
            replacing           UNAVAIL      0   534     0  insufficient replicas
              label/disk04/old  OFFLINE      0   544     0
              label/disk04      FAULTED      0   544     0  corrupted data
            label/disk13        ONLINE       0     0     0
            label/disk14        ONLINE       0     0     0
            label/disk15        ONLINE       0     0     0
            label/disk16        ONLINE       0     0     0
          raidz2                ONLINE       0     0     0
            label/disk05        ONLINE       0     0     0
            label/disk06        ONLINE       0     0     0
            label/disk07        ONLINE       0     0     0
            label/disk08        ONLINE       0     0     0
            label/disk17        ONLINE       0     0     0
            label/disk18        ONLINE       0     0     0
            label/disk19        ONLINE       0     0     0
            label/disk20        ONLINE       0     0     0
          raidz2                ONLINE       0     0     0
            label/disk09        ONLINE       0     0     0
            label/disk10        ONLINE       0     0     0
            label/disk11        ONLINE       0     0     0
            label/disk12        ONLINE       0     0     0
            label/disk21        ONLINE       0     0     0
            label/disk22        ONLINE       0     0     0
            label/disk23        ONLINE       0     0     0
            label/disk24        ONLINE       0     0     0
        cache
          label/cache           ONLINE       0     0     0

errors: No known data errors

[fc...@thehive  ~]$ sudo zpool replace storage label/disk04
cannot replace label/disk04 with label/disk04: cannot replace a replacing device
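
For comparison, the sequence from those archived success stories boils down
to something like the following.  This is only a sketch based on my reading
of the threads (details vary between reports, and I'm using this pool's
device names purely for illustration); it assumes one member of the
"replacing" vdev is still ONLINE and therefore detachable, which is exactly
what we don't have here:

  # detach whichever member of the replacing vdev is still detachable
  sudo zpool detach storage label/disk04/old
  # then re-issue the replace onto the new disk
  sudo zpool replace storage label/disk04
  # (some threads report running "zpool online storage label/disk04" in between)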


Note the OFFLINE status for label/disk04/old.  I cannot get either drive to
detach, replace, online, or offline.  "zpool online" on the old device just
changes its status to UNAVAIL, while "zpool detach" and "zpool offline" both
fail with the same error:  no valid replicas.

I've tried removing the underlying device, booting both with and without the
drive in the system, and all manner of zpool commands, all without success.

Is there any way to recover from this error?  Or am I doomed to destroy a 10
TB pool?

FreeBSD thehive.sd73.bc.ca 8.0-STABLE FreeBSD 8.0-STABLE #3: Fri Jan 15 11:08:47 PST 2010     r...@thehive.sd73.bc.ca:/usr/obj/usr/src-8/sys/ZFSHOST amd64

ZFSv13 (ZFSv14 is available)

-- 
Freddie Cash
fjwc...@gmail.com