I'm in the process of standing up a couple of T5440s, whose config will
eventually end up in another data center 6,000 miles from the original,
and I'm supposed to send disks to that data center and we'll start from
there (yes, I know how to flar and jumpstart; when the boss says do
something, sometimes you *just* have to do it).

As I've already run into the failsafe boot issue when moving a root disk
from one SPARC host to another, I recently found out that a sys-unconfig'd
disk does not suffer from the same problem.
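
For anyone who wants to repeat the experiment, the prep on the donor
host is just the stock command; a minimal sketch, assuming you are root
and don't mind the box halting on the spot:

    # on the donor host
    sys-unconfig
    # asks for confirmation, strips the host-specific identity
    # (hostname, network, sysid answers), and halts the system;
    # pull the root disk once it is down

On its next boot the disk just walks through the normal sysid questions
instead of tripping over the old host's identity.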

While I am probably going to be told I shouldn't be doing this, I ran
into an interesting "semantics" issue that I think zfs should at least
be able to avoid (and which I have seen in other, non-abusive
configurations... ;-)

Two zfs disks, root mirrored: c2t0 and c2t1.

hot-unplug c2t0 (and I should probably have detached the now-missing
half of the mirror from the pool left on c2t1, but I didn't; see the
sketch after the status output below)

sys-unconfig the remaining disk in c2t1

move the disk to the new T5440

boot the disk; it enumerates everything correctly, and then I notice
zpool thinks the pool is degraded.  (I had attached the new mirror half,
c2t3d0s0, only after I realized I wanted to run this by the list....)

  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: resilver completed after 0h7m with 0 errors on Thu Jan  7 12:10:03 2010
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c2t0d0s0  ONLINE       0     0     0
            c2t0d0s0  FAULTED      0     0     0  corrupted data
            c2t3d0s0  ONLINE       0     0     0  13.8G resilvered
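
As hinted at above, the tidier sequence would have been to detach the
half I was about to pull, so the pool config riding along on the
surviving disk doesn't carry a stale vdev to the new chassis. A rough
sketch, using the device names from the original box:

    # on the original host, before yanking c2t0
    zpool detach rpool c2t0d0s0
    zpool status rpool    # rpool is now a plain single-disk pool on c2t1d0s0

Once the disk is in the new machine, a fresh mirror half just gets
'zpool attach'ed, which is effectively what the c2t3d0s0 resilver above
is.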


Anyway, should zfs report a faulted drive with the same ctd# as a device
that is already active?  I understand why this happened, but from a
logistics perspective, shouldn't zfs be smart enough to ignore a faulted
disk like this?  This is not the first time I've seen this scenario,
either: I had an x4500 that had suffered through months of marvell
driver bugs and corruption, and we probably had 2 or 3 of these happen
while trying to "soft" fix the problems.  It also happened with hot
spares, which caused support to spend some time with back-line figuring
out a procedure to clear those faulted disks that had the same ctd# as a
working hot spare.
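
For anyone hitting the same thing: one way to clear a phantom entry like
this without touching the identically-named live device is to address it
by its vdev guid instead of the colliding ctd# name. A rough sketch from
memory, so double-check before trusting it (the guid below is made up):

    # dump the pool config; every vdev, including the missing one,
    # is listed with a "guid:" field
    zdb -C rpool
    # detach the phantom by guid rather than by the ambiguous ctd# name
    zpool detach rpool 9876543210987654321
    zpool clear rpool    # clear any lingering error counts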

Ben