Re: [zfs-discuss] Speeding up resilver on x4500

Erik Trimble Tue, 23 Jun 2009 10:58:57 -0700

Richard Elling wrote:

Erik Trimble wrote:
All this discussion hasn't answered one thing for me: exactly _how_ does ZFS do resilvering? Both in the case of mirrors, and of RAIDZ[2]?
I've seen some mention that it goes in chronological order (which to me, means that the metadata must be read first) of file creation, and that only used blocks are rebuilt, but exactly what is the methodology being used?
See Jeff Bonwick's blog on the topic
http://blogs.sun.com/bonwick/entry/smokin_mirrors
-- richard


That's very informative. Thanks, Richard.

So, ZFS walks the used block tree to see what still needs rebuilding. I guess I have two related questions then:

(1) Are these blocks some fixed size (based on the media - usually 512 bytes), or are they "ZFS blocks" - the fungible size based on the requirements of the original file size being written?

(2) Is there some reasonable way to read in multiples of these blocks in a single IOP? Theoretically, if the blocks are in chronological creation order, they should be (relatively) sequential on the drive(s). Thus, ZFS should be able to read in several of them without forcing a random seek. That is, you should be able to get multiple blocks in a single IOP.
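
To make question (2) concrete, here is a minimal sketch of what "multiple blocks in a single IOP" would mean: blocks that sit back to back on disk get merged into one read. This is purely illustrative and makes no claim about how ZFS actually issues resilver I/O. The `coalesce` function, the `max_io` limit, and the (offset, size) extents are all hypothetical, and sorting by offset is a simplification of the "chronological order" idea.

```python
def coalesce(extents, max_io=1 << 20):
    """Merge on-disk extents that are exactly adjacent into larger reads.

    extents: iterable of (offset, size) in bytes (hypothetical block locations).
    max_io:  assumed upper bound, in bytes, on the size of one merged read.
    Returns a list of (offset, size) reads, sorted by offset.
    """
    reads = []
    for offset, size in sorted(extents):
        if reads:
            last_off, last_size = reads[-1]
            # Merge only if this block starts exactly where the previous read
            # ends and the merged read stays within the assumed size limit.
            if last_off + last_size == offset and last_size + size <= max_io:
                reads[-1] = (last_off, last_size + size)
                continue
        reads.append((offset, size))
    return reads


# Three adjacent 4 KiB blocks plus one block far away: 4 blocks, 2 reads.
blocks = [(0, 4096), (4096, 4096), (8192, 4096), (1 << 30, 4096)]
print(coalesce(blocks))  # [(0, 12288), (1073741824, 4096)]
```

If the blocks really are laid out mostly sequentially, the number of reads drops well below the number of blocks. If they are scattered, nothing merges and every block costs its own seek.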

If we can't get multiple ZFS blocks in one sequential read, we're screwed - ZFS is going to be IOPS bound on the replacement disk, with no real workaround. Which means rebuild times for disks with lots of small files are going to be hideous.
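
A back-of-the-envelope sketch of why this worries me, assuming the rebuild is limited purely by random IOPS. Every number here is an assumption for illustration: 500 GB of used data is hypothetical, and 100 IOPS is only a rough ballpark for a single spinning disk, not a measurement. The `resilver_hours` helper is likewise made up for this example.

```python
def resilver_hours(used_bytes, avg_block_bytes, iops, blocks_per_io=1):
    """Estimate hours to resilver when limited only by random IOPS.

    All inputs are assumptions supplied by the caller, not measured values.
    blocks_per_io models how many blocks one I/O could fetch if adjacent
    blocks were read together.
    """
    blocks = used_bytes / avg_block_bytes   # number of blocks to rebuild
    ios = blocks / blocks_per_io            # I/O operations required
    return ios / iops / 3600                # seconds -> hours


used = 500 * 10**9  # assumed 500 GB of used data on the disk

# Lots of small files: 4 KiB blocks, one block per I/O.
print(f"{resilver_hours(used, 4096, 100):.1f} hours")        # roughly 339 hours

# Large files: 128 KiB blocks, one block per I/O.
print(f"{resilver_hours(used, 128 * 1024, 100):.1f} hours")  # roughly 10.6 hours
```

Under these assumptions the small-block case takes about two weeks, and fetching, say, 8 blocks per I/O would cut that by the same factor of 8. That is why the answer to question (2) matters so much.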




--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
