Hi,

Thanks for the quick reply. Now that you mention it, we have a different
question: what is the advantage of using spare disks instead of including
them in the raid-z array? If the system pool is on mirrored disks, I think
that would be enough (hopefully). When one disk fails, isn't it better to
have a spare disk on standby, instead of one more disk in the raid-z and no
spares (or just a few)? Or, rephrased: is it safer and faster to replace a
disk in a raid-z3 and rebuild the data from the remaining disks, or to have
a raid-z2 with a spare disk?

Thank you,

On Mon, Nov 29, 2010 at 6:03 AM, Erik Trimble <erik.trim...@oracle.com> wrote:

> On 11/28/2010 1:51 PM, Paul Piscuc wrote:
>
>> Hi,
>>
>> We are a company that wants to replace our current storage layout with one
>> that uses ZFS. We have been testing it for a month now, and everything looks
>> promising. One element that we cannot determine is the optimum number of
>> disks in a raid-z pool. In the ZFS best practices guide, 7, 9, and 11 disks
>> are recommended for a single raid-z2.  On the other hand, another user
>> says that the most important consideration is the distribution of the
>> default 128k record size across all the disks. So, the recommended layouts
>> would be:
>>
>> 4-disk RAID-Z2 = 128KiB / 2 = 64KiB = good
>> 5-disk RAID-Z2 = 128KiB / 3 = ~43KiB = not good
>> 6-disk RAID-Z2 = 128KiB / 4 = 32KiB = good
>> 10-disk RAID-Z2 = 128KiB / 8 = 16KiB = good
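>>
>> (As a quick check of that arithmetic, assuming 2 parity disks per raid-z2,
>> a shell loop like this prints each data disk's share of a 128KiB record,
>> where 131072 = 128 * 1024:
>>
>>   for n in 4 5 6 10; do
>>       echo "$n disks: $(( 131072 / (n - 2) )) bytes per data disk"
>>   done
>>
>> The 5-disk case comes out to 43690 bytes, which is not a power of two,
>> hence the "not good" above.)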
>>
>> What are your recommendations regarding the number of disks? We are
>> planning to use 2 raid-z2 pools with 8+2 disks, 2 spare, 2 SSDs for L2ARC, 2
>> SSDs for ZIL, 2 for syspool, and a similar machine for replication.
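>>
>> (For reference, if we build this as a single pool with two raid-z2 vdevs,
>> it would be created with something like the following sketch; the device
>> names are just placeholders:
>>
>>   zpool create tank \
>>       raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
>>              c1t5d0 c1t6d0 c1t7d0 c1t8d0 c1t9d0 \
>>       raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 \
>>              c2t5d0 c2t6d0 c2t7d0 c2t8d0 c2t9d0 \
>>       spare c3t0d0 c3t1d0 \
>>       log mirror c4t0d0 c4t1d0 \
>>       cache c4t2d0 c4t3d0
>>
>> with the syspool on its own mirrored pair and the second machine receiving
>> snapshots for replication.)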
>>
>> Thanks in advance,
>>
>>
> You've hit on one of the hardest parts of using ZFS - optimization.   Truth
> of the matter is that there is NO one-size-fits-all "best" solution. It
> heavily depends on your workload type - access patterns, write patterns,
> type of I/O, and size of average I/O request.
>
> A couple of things here:
>
> (1) Unless you are using Zvols for "raw" disk partitions (for use with
> something like a database), the recordsize value is a MAXIMUM value, NOT an
> absolute value.  Thus, if you have a ZFS filesystem with a record size of
> 128k, it will break up I/O into 128k chunks for writing, but it will also
> write smaller chunks.  I forget what the minimum size is (512b or 1k, IIRC),
> but what ZFS does is use a variable block size, up to the maximum size
> specified in the "recordsize" property.  So, if recordsize=128k and you
> have a 190k write I/O op, it will write a 128k chunk and a 64k chunk (64k
> being the smallest power of two greater than the remaining 62KiB of
> data).  It WON'T write two 128k chunks.
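>
> For illustration, you can reproduce that rounding with a couple of lines of
> shell (sizes in KiB; the loop just finds the smallest power of two at or
> above the 62KiB remainder):
>
>   remainder=$(( 190 - 128 ))    # 62 KiB left after one full 128k record
>   chunk=1
>   while [ $chunk -lt $remainder ]; do chunk=$(( chunk * 2 )); done
>   echo "tail chunk: ${chunk}KiB"    # prints: tail chunk: 64KiB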
>
> (2) #1 comes up a bit when you have a mix of file sizes - for instance,
> home directories, where you have lots of small files (initialization files,
> source code, etc.) combined with some much larger files (images, mp3s,
> executable binaries, etc.).  Thus, such a filesystem will have a wide
> variety of chunk sizes, which makes optimization difficult, to say the
> least.
>
> (3) For *random* I/O, a raidZ of any number of disks performs roughly like
> a *single* disk in terms of IOPs and a little better than a single disk in
> terms of throughput.  So, if you have considerable amounts of random I/O,
> you should really either use small raidz configs (no more than 4 data
> disks), or switch to mirrors instead.
>
> (4) For *sequential* or large-size I/O, a raidZ performs roughly equivalent
> to a stripe of the same number of data disks. That is, an N-disk raidz2 will
> perform about the same as an (N-2)-disk stripe in terms of throughput and
> IOPS.
>
> (5) As I mentioned in #1, *all* ZFS I/O is broken up into
> powers-of-two-sized chunks, even if the last chunk must have some padding in
> it to get to a power-of-two.   This has implications as to the best number
> of disks in a raidZ(n).
>
>
> I'd have to re-look at the ZFS Best Practices Guide, but I'm pretty sure
> the recommendation of 7, 9, or 11 disks was for a raidz1, NOT a raidz2.  Due
> to #5 above, best performance comes with an EVEN number of data disks in any
> raidZ, so a write to any disk is always a full portion of the chunk, rather
> than a partial one (that sounds funny, but trust me).  The best balance of
> size, IOPs, and throughput is found in the mid-size raidZ(n) configs, where
> there are 4, 6 or 8 data disks.
>
>
> Honestly, even with you describing a workload, it will be hard for us to
> give you an exact answer. My best suggestion is to do some testing with
> raidZ(n) of different sizes, to see the tradeoffs between size and
> performance.
>
>
> Also, in your sample config, unless you plan to use the spare disks for
> redundancy on the boot mirror, it would be better to configure 2 x 11-disk
> raidZ3 than 2 x 10-disk raidZ2 + 2 spares. Better reliability.
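>
> In zpool terms (device names are placeholders), that alternative would look
> something like the sketch below; note that each vdev still has 8 data
> disks:
>
>   zpool create tank \
>       raidz3 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
>              c1t6d0 c1t7d0 c1t8d0 c1t9d0 c1t10d0 \
>       raidz3 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 \
>              c2t6d0 c2t7d0 c2t8d0 c2t9d0 c2t10d0
>
> Each vdev can then lose any three disks, rather than two plus a spare that
> still has to resilver before it protects anything.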
>
>
> --
> Erik Trimble
> Java System Support
> Mailstop:  usca22-123
> Phone:  x17195
> Santa Clara, CA
>
>
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
