Brandon,

Yes, this is something that should be possible once we have bp rewrite (the ability to move blocks around). One minor downside to "hot space" would be that it couldn't be shared among multiple pools the way that hot spares can.

Also, depending on the pool configuration, hot space may be impractical -- for example, if you are using wide RAIDZ[-N] stripes. If you have, say, 4 top-level RAIDZ-2 vdevs with 10 disks each, you would have to keep your pool at most 3/4 full to be able to take advantage of hot space. And if you wanted to tolerate any 2 disks failing, the pool could be at most 1/2 full. (Although one could imagine eventually recombining some of the remaining 18 good disks to make another RAIDZ group.)
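
In case it helps to see where those fractions come from, here is a rough back-of-the-envelope sketch (plain Python, nothing ZFS-specific; the function is purely illustrative and assumes data is spread evenly across equally sized top-level vdevs): the data on the vdevs being evacuated has to fit into the free space left on the surviving ones.

    # Maximum pool fill fraction that still allows evacuating `evacuated`
    # of `total` equally sized top-level vdevs into the pool's free space.
    # Constraint: evacuated * fill <= (total - evacuated) * (1 - fill),
    # which solves to fill <= (total - evacuated) / total.
    def max_fill_fraction(total, evacuated):
        return (total - evacuated) / float(total)

    print(max_fill_fraction(4, 1))  # 0.75 -> at most 3/4 full to absorb one vdev
    print(max_fill_fraction(4, 2))  # 0.5  -> at most 1/2 full if 2 disks fail in
                                    #         2 different vdevs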

So I imagine that with this implementation at least (removing the faulted top-level vdev), hot space would only be practical when using mirroring. That said, once we have (top-level) device removal implemented, you could implement a poor-man's hot space with some simple scripts -- just remove the degraded top-level vdev from the pool.
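
To make that concrete, something along these lines is all I have in mind -- a sketch only, not working code, since it assumes a future "zpool remove <pool> <vdev>" that accepts top-level vdevs, the pool name "tank" is made up, and the "zpool status" parsing is deliberately naive:

    #!/usr/bin/env python
    # Poor-man's hot space watcher (sketch).  Polls the pool and, when a
    # mirror or raidz top-level vdev reports DEGRADED, evacuates it with
    # device removal so full redundancy is restored out of free space.
    import subprocess
    import time

    POOL = "tank"   # hypothetical pool name

    def degraded_toplevel_vdevs(pool):
        out = subprocess.check_output(["zpool", "status", pool]).decode()
        vdevs = []
        for line in out.splitlines():
            fields = line.split()
            # Only look at mirror/raidz rows in the config section; skip
            # the pool line, the "state:" line, and individual disk lines.
            if (len(fields) >= 2 and fields[1] == "DEGRADED"
                    and fields[0].startswith(("mirror", "raidz"))):
                vdevs.append(fields[0])
        return vdevs

    while True:
        for vdev in degraded_toplevel_vdevs(POOL):
            # Assumes top-level device removal exists; today this would fail.
            subprocess.call(["zpool", "remove", POOL, vdev])
        time.sleep(60)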

FYI, I am currently working on bp rewrite for device removal.

--matt

Brandon High wrote:
This might have been mentioned on the list already and I just can't
find it now, or I might have misread something and come up with this ...

Right now, using hot spares is a typical method to increase storage
pool resiliency, since it minimizes the time that an array is
degraded. The downside is that drives assigned as hot spares are
essentially wasted. They take up space & power but don't provide
usable storage.

Depending on the number of spares you've assigned, you could have 7%
of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
This is on top of the RAID6 / raidz[1-3] overhead.
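
(Quick arithmetic behind that figure, with the 14-disk shelf being the only assumption:)

    # 1 hot spare per 14-disk shelf sits idle.
    spares, disks_per_shelf = 1, 14
    print("%.1f%%" % (100.0 * spares / disks_per_shelf))  # ~7.1% of raw capacity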

What about using the free space in the pool to cover for the failed drive?

With bp rewrite, would it be possible to rebuild the missing data from
parity and simultaneously rewrite those blocks to healthy devices? In
other words, when there is enough free space, remove the failed device
from the zpool, resizing (shrinking) it on the fly and restoring full
parity protection for your data. If online shrinking doesn't work,
create a phantom file that accounts for all the space lost by the
removal of the device until an export / import.

It's not something I'd want to do with less than raidz2 protection,
and I imagine that replacing the failed device and expanding the
stripe width back to the original would have some negative performance
implications that would not occur otherwise. I also imagine it would
take a lot longer to rebuild / resilver at both device failure and
device replacement. You wouldn't be able to share a spare among many
vdevs either, but you wouldn't always need to if you leave some space
free on the zpool.

Provided that bp rewrite is committed, and vdev & zpool shrinks are
functional, could this work? It seems like a feature most applicable
to SOHO users, but I'm sure some enterprise users could find an
application for nearline storage where available space trumps
performance.

-B


