[zfs-discuss] Rethinking my zpool

2010-03-19 Thread Chris Dunbar - Earthside, LLC
Hello,

After being immersed in this list and other ZFS sites for the past few weeks I 
am having some doubts about the zpool layout on my new server. It's not too 
late to make a change, so I thought I would ask for comments. My current plan is 
to have 12 x 1.5 TB disks in what I would normally call a RAID 10 
configuration. That doesn't seem to be the right term here, but there are 6 
sets of mirrored disks striped together. I know that "smaller" sets of disks 
are preferred, but how small is small? I am wondering if I should break this 
into two sets of 6 disks. I do have a 13th disk available as a hot spare. Would 
it be available for either pool if I went with two? Finally, would I be better 
off with raidz2 or something else instead of the striped mirrored sets? 
Performance and fault tolerance are my highest priorities.

Thank you,
Chris Dunbar 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rethinking my zpool

2010-03-19 Thread Scott Meilicke
You will get much better random IO with mirrors, and better reliability when a 
disk fails with raidz2. Six sets of mirrors are fine for a pool. From what I 
have read, a hot spare can be shared across pools. I think the correct term 
would be "load balanced mirrors", vs RAID 10.

What kind of performance do you need? Maybe raidz2 will give you the 
performance you need. Maybe not. Measure the performance of each configuration 
and decide for yourself. I am a big fan of iometer for this type of work.

-Scott


Re: [zfs-discuss] Rethinking my zpool

2010-03-19 Thread Brandon High
On Fri, Mar 19, 2010 at 5:32 AM, Chris Dunbar - Earthside, LLC
<cdun...@earthside.net> wrote:

> if I went with two? Finally, would I be better off with raidz2 or something
> else instead of the striped mirrored sets? Performance and fault tolerance
> are my highest priorities.
>

Performance and fault tolerance are somewhat conflicting.

You'll have good fault tolerance and performance using a wide raidz3 stripe,
eg: 12-disk raidz3 with a spare.

You'll have the best fault tolerance using small raidz3 stripes with a
spare, for instance 2 x 6-disk raidz3. This uses 50% of your disks for
redundancy.

You'll have slightly better performance and slightly worse fault tolerance
using raidz2 instead in both cases above. I would not recommend using raidz,
as it will offer almost no real fault tolerance with the size of drives
you're using.

You'll have your best performance and fault tolerance using 3-way mirrors,
but you sacrifice 2/3 of your disks to do it. Actually, I think that raidz3
is higher tolerance still, but the performance difference will be huge.

2-way mirrors are slightly worse for fault tolerance (below raidz2, I believe)
but give good performance.

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] Rethinking my zpool

2010-03-19 Thread Erik Trimble

Brandon High wrote:
> On Fri, Mar 19, 2010 at 5:32 AM, Chris Dunbar - Earthside, LLC
> <cdun...@earthside.net> wrote:
>
> > if I went with two? Finally, would I be better off with raidz2 or
> > something else instead of the striped mirrored sets? Performance
> > and fault tolerance are my highest priorities.
>
> Performance and fault tolerance are somewhat conflicting.
>
> You'll have good fault tolerance and performance using a wide raidz3
> stripe, eg: 12-disk raidz3 with a spare.

Actually, except on certain loads (large, streaming write/read), this
config is going to give pretty poor performance.

> You'll have the best fault tolerance using small raidz3 stripes with a
> spare, for instance 2 x 6-disk raidz3. This uses 50% of your disks for
> redundancy.
>
> You'll have slightly better performance and slightly worse fault
> tolerance using raidz2 instead in both cases above. I would not
> recommend using raidz, as it will offer almost no real fault tolerance
> with the size of drives you're using.

Realistically, a 2 x 6-disk raidz2 with a hot spare will provide
/almost/ the same level of redundancy as 2 x 6-disk raidz3, and about
30% better performance and space. (he said he had 13 disks)

> You'll have your best performance and fault tolerance using 3-way
> mirrors, but you sacrifice 2/3 of your disks to do it. Actually, I
> think that raidz3 is higher tolerance still, but the performance
> difference will be huge.
>
> 2-way mirrors is slightly worse for fault tolerance (below raidz2 I
> believe) and good performance.

Yes - see my followup post for percentages of failures.

--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA



Re: [zfs-discuss] Rethinking my zpool

2010-03-19 Thread Erik Trimble

Chris Dunbar - Earthside, LLC wrote:

> Hello,
>
> After being immersed in this list and other ZFS sites for the past few weeks I am having
> some doubts about the zpool layout on my new server. It's not too late to make a change
> so I thought I would ask for comments. My current plan is to have 12 x 1.5 TB disks in
> what I would normally call a RAID 10 configuration. That doesn't seem to be the right
> term here, but there are 6 sets of mirrored disks striped together. I know that
> "smaller" sets of disks are preferred, but how small is small? I am wondering
> if I should break this into two sets of 6 disks. I do have a 13th disk available as a hot
> spare. Would it be available for either pool if I went with two? Finally, would I be
> better off with raidz2 or something else instead of the striped mirrored sets?
> Performance and fault tolerance are my highest priorities.
>
> Thank you,
> Chris Dunbar

There's not much benefit I can see to having two pools if both are using 
the same configuration (i.e., all mirrors or all raidz). There are reasons 
to do so, but I don't see that they would be of any real benefit for 
what you describe.  A hot spare disk can be assigned to multiple pools 
(often referred to as a "global" hot spare).
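
For reference, a striped-mirror pool with a spare maps onto a single `zpool create` invocation, and an existing spare can be attached to a second pool with `zpool add`. The sketch below only builds and prints those command lines; it does not run anything. The pool names (`tank`, `tank2`) and the Solaris-style device names (`c0t0d0` through `c0t12d0`) are placeholders, not taken from the original post.

```python
def mirror_pool_command(pool, disks, spare=None, way=2):
    """Build a `zpool create` argument list for striped N-way mirrors."""
    if len(disks) % way != 0:
        raise ValueError("disk count must be a multiple of the mirror width")
    cmd = ["zpool", "create", pool]
    # Each group of `way` disks becomes one top-level mirror vdev.
    for i in range(0, len(disks), way):
        cmd += ["mirror"] + disks[i:i + way]
    if spare is not None:
        cmd += ["spare", spare]
    return cmd


def share_spare_command(pool, spare):
    """Build the `zpool add` argument list that attaches a spare to a pool."""
    return ["zpool", "add", pool, "spare", spare]


# Hypothetical device names: c0t0d0 .. c0t11d0 for data, c0t12d0 as the spare.
disks = [f"c0t{n}d0" for n in range(12)]
print(" ".join(mirror_pool_command("tank", disks, spare="c0t12d0")))
print(" ".join(share_spare_command("tank2", "c0t12d0")))
```

Passing `way=3` with the same twelve disks would give the 4 x 3-way mirror layout instead.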


The preference for raidz[123] configs is to have 4-6 data disks in the vdev.

Realistically speaking, you have several different (practical) 
configurations possible, in order of general performance:


(a)  6 x 2-way mirrors + 1 pool hot spare      ->  9TB usable
(b)  4 x 3-way mirrors + 1 pool hot spare      ->  6TB usable
(c)  1 6-disk raidz + 1 7-disk raidz           -> 16.5TB usable
(d)  2 6-disk raidz + 1 pool hot spare         -> 15TB usable
(e)  1 6-disk raidz2 + 1 7-disk raidz2         -> 13.5TB usable
(f)  2 6-disk raidz2 + 1 pool hot spare        -> 12TB usable
(g)  1 6-disk raidz3 + 1 7-disk raidz3         -> 10.5TB usable
(h)  1 13-disk raidz3                          -> 15TB usable
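
The usable figures in this list follow from simple arithmetic: a mirror vdev contributes one disk's worth of space regardless of its width, and a raidz vdev contributes its width minus its parity count. A minimal sketch of that calculation, assuming 1.5 TB per disk and ignoring metadata overhead and TB/TiB rounding (the function and variable names are illustrative):

```python
DISK_TB = 1.5  # size of each disk, from the original post

# Parity disks per vdev for each raidz level.
PARITY = {"raidz": 1, "raidz2": 2, "raidz3": 3}


def vdev_data_disks(kind, width):
    """Number of disks' worth of usable space in one vdev."""
    if kind == "mirror":
        return 1  # an N-way mirror stores one disk's worth of data
    return width - PARITY[kind]


def usable_tb(vdevs, disk_tb=DISK_TB):
    """Usable space of a pool given as a list of (kind, width) vdevs."""
    return sum(vdev_data_disks(kind, width) for kind, width in vdevs) * disk_tb


layouts = {
    "a": [("mirror", 2)] * 6,
    "b": [("mirror", 3)] * 4,
    "c": [("raidz", 6), ("raidz", 7)],
    "d": [("raidz", 6)] * 2,
    "e": [("raidz2", 6), ("raidz2", 7)],
    "f": [("raidz2", 6)] * 2,
    "g": [("raidz3", 6), ("raidz3", 7)],
    "h": [("raidz3", 13)],
}

for name, vdevs in layouts.items():
    print(f"({name}) {usable_tb(vdevs):.1f} TB usable")
```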

Given the size of your disks, resilvering is likely to take a 
significant amount of time in any raidz[123] configuration.   That is, 
unless you are storing (almost exclusively) very large files, resilver 
time is going to be long, and can potentially be radically longer 
than with a mirrored config.


The mirroring configs will out-perform raidz[123] on everything except 
large streaming write/reads, and even then, it's a toss-up. 

Overall, the (a), (d), and (f) configurations generally offer the best 
balance of redundancy, space, and performance.


Here are the chances of surviving multiple disk failures (assuming hot 
spares cannot be used; that is, all disk failures happen in a short period 
of time) - note that all three can always survive a single disk failure:


(a)   90% for 2, 73% for 3, 49% for 4, 25% for 5.
(d)   55% for 2, 27% for 3, 0% for 4 or more
(f)   100% for 2, 80% for 3, 56% for 4, 0% for 5.
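
Figures like these can be checked by counting. The sketch below assumes every set of k failed disks is equally likely and that no spare has been brought in (the assumption stated above); the function name and the (width, tolerated) encoding are illustrative. Exact counting under this model lands within a couple of points of most of the quoted figures but not all of them: two single-parity raidz vdevs can never absorb three simultaneous failures, so (d) comes out at 0% for 3, and (f) comes out nearer 45% than 56% for 4. The quoted percentages are best read as approximate.

```python
from math import comb


def survival_probability(vdevs, failures):
    """Chance a pool survives `failures` simultaneous, uniformly random disk losses.

    `vdevs` is a list of (width, tolerated) pairs: the number of disks in each
    top-level vdev and how many of them it can lose without losing data.
    """
    # ways[k] = number of ways to pick k failed disks with no vdev over its limit
    ways = [1]
    for width, tolerated in vdevs:
        new = [0] * (len(ways) + tolerated)
        for k, count in enumerate(ways):
            for j in range(tolerated + 1):
                new[k + j] += count * comb(width, j)
        ways = new
    total_disks = sum(width for width, _ in vdevs)
    survivable = ways[failures] if failures < len(ways) else 0
    return survivable / comb(total_disks, failures)


layouts = {
    "a": [(2, 1)] * 6,  # 6 x 2-way mirrors
    "d": [(6, 1)] * 2,  # 2 x 6-disk raidz
    "f": [(6, 2)] * 2,  # 2 x 6-disk raidz2
}

for name, vdevs in layouts.items():
    row = ", ".join(
        f"{survival_probability(vdevs, k):.0%} for {k}" for k in range(2, 6)
    )
    print(f"({name}) {row}")
```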


Depending on your exact requirements, I'd go with (a) or (f) as the best 
choices - (a) if performance is more important, (f) if redundancy 
overrides performance.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA



Re: [zfs-discuss] Rethinking my zpool

2010-03-19 Thread Darren J Moffat
12 disks in mirrored pairs is a small configuration.  The "smaller" sets 
you refer to might be the number of disks in a raidz/raidz2/raidz3 
top-level vdev.


You say performance is one of your top priorities, but what is the 
workload? Mostly read? Mostly write? Random? Sequential?



See the ZFS Best Practices guide on the solarisinternals.com site for 
guidance on how to select your pool layout.


http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

In particular this part:

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Storage_Pool_Performance_Considerations

--
Darren J Moffat


Re: [zfs-discuss] Rethinking my zpool

2010-03-20 Thread Richard Elling
On Mar 19, 2010, at 5:32 AM, Chris Dunbar - Earthside, LLC wrote:

> Hello,
> 
> After being immersed in this list and other ZFS sites for the past few weeks 
> I am having some doubts about the zpool layout on my new server. It's not too 
> late to make a change so I thought I would ask for comments. My current plan 
> is to have 12 x 1.5 TB disks in what I would normally call a RAID 10 
> configuration. That doesn't seem to be the right term here, but there are 6 
> sets of mirrored disks striped together. I know that "smaller" sets of disks 
> are preferred, but how small is small? I am wondering if I should break this 
> into two sets of 6 disks. I do have a 13th disk available as a hot spare. 
> Would it be available for either pool if I went with two? Finally, would I be 
> better off with raidz2 or something else instead of the striped mirrored 
> sets? Performance and fault tolerance are my highest priorities.

Do you believe in coincidence? :-)  I recently blogged about the reliability
analysis using 12 disks as a representative sample.  I didn't add a hot
spare for this analysis, but it would help in all cases.
http://blog.richardelling.com/2010/02/zfs-data-protection-comparison.html

For those disinclined to click, data retention when mirroring wins over raidz
when looking at the problem from the perspective of number of drives 
available.  Why? Because 5+1 raidz survives the loss of any disk, but 3 sets
of 2-way mirrors can survive the loss of 3 disks, as long as 2 of those disks 
are not in the same set. The rest is just math.
 -- richard
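
The six-disk comparison above is small enough to verify by brute force. This sketch enumerates every possible set of failed disks for a 5+1 raidz and for three 2-way mirrors and reports the fraction of cases each layout survives. It assumes all failure sets of a given size are equally likely; the disk numbering and function names are illustrative.

```python
from itertools import combinations


def survives(failed, vdevs, tolerated):
    """True if no vdev has lost more disks than it tolerates."""
    return all(len(failed & vdev) <= tolerated for vdev in vdevs)


def survival_table(vdevs, tolerated):
    """Fraction of k-disk failure sets the pool survives, for every k."""
    disks = sorted(set().union(*vdevs))
    table = {}
    for k in range(1, len(disks) + 1):
        cases = [set(c) for c in combinations(disks, k)]
        ok = sum(survives(failed, vdevs, tolerated) for failed in cases)
        table[k] = ok / len(cases)
    return table


# Disks are numbered 0..5; each frozenset is one top-level vdev.
raidz_5_plus_1 = [frozenset(range(6))]
three_mirrors = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]

print("5+1 raidz        :", survival_table(raidz_5_plus_1, tolerated=1))
print("3 x 2-way mirrors:", survival_table(three_mirrors, tolerated=1))
```

Both layouts survive any single failure, but only the mirrored layout has a nonzero chance of surviving two or three.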

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 







Re: [zfs-discuss] Rethinking my zpool

2010-03-20 Thread Brandon High
On Sat, Mar 20, 2010 at 1:35 PM, Richard Elling wrote:

> For those disinclined to click, data retention when mirroring wins over
> raidz when looking at the problem from the perspective of number of drives
> available.  Why? Because 5+1 raidz survives the loss of any disk, but 3
> sets of 2-way mirrors can survive the loss of 3 disks, as long as 2 of
> those disks are not in the same set. The rest is just math.
>
>

The one dimension left out in your comparison is the portion of space that's
available for use vs. redundancy overhead. I'm sure you just never thought
of it. ;-)

For 12 disks using a 4-way mirror, you'd have 75% overhead but the best
MTTDL. raidz3 is only 25% overhead, but provides a better MTTDL than 3-way
mirrors (at 66% overhead). raidz2 (16% overhead) has better MTTDL than 2-way
mirrors (at 50%).
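
The overhead percentages above are the fraction of raw disks spent on redundancy. A short sketch of the arithmetic, assuming the raidz figures refer to a single 12-disk vdev (which is what 25% and 16% imply); the helper names are illustrative:

```python
from fractions import Fraction

TOTAL_DISKS = 12


def mirror_overhead(way):
    """Fraction of raw space spent on redundancy in N-way mirrors."""
    return Fraction(way - 1, way)


def raidz_overhead(parity, width=TOTAL_DISKS):
    """Fraction of raw space spent on parity in one raidz vdev of `width` disks."""
    return Fraction(parity, width)


rows = [
    ("4-way mirrors", mirror_overhead(4)),
    ("3-way mirrors", mirror_overhead(3)),
    ("2-way mirrors", mirror_overhead(2)),
    ("12-disk raidz3", raidz_overhead(3)),
    ("12-disk raidz2", raidz_overhead(2)),
]
for name, overhead in rows:
    print(f"{name:15s} {float(overhead):.1%} overhead")
```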

So clearly, if fault tolerance is the absolute most important factor, a
really big mirror is best. This will also give very good read performance. I
imagine a 12-way mirror would last a while (2.09E+57 years according to
Richard's formula) but it's also at high cost.

I think the only real route to follow is to determine how much space you
need, and then optimize MTTDL and performance around that constraint. If you
determine that you need 10 TB available, then (using 1.5T drives) you need
to use at least 7 disks for data. That means a 12-disk raidz3 (13.5 TB), or
2x 6-disk raidz2 (12 TB). The raidz3 will have higher fault tolerance, but
lower performance.
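
The "fix the space requirement first" approach can be written down as a filter: compute the minimum number of data disks, then keep only the layouts that reach it. A minimal sketch using the 10 TB example and 1.5 TB drives from the paragraph above; the candidate list is illustrative and not exhaustive.

```python
from math import ceil

DISK_TB = 1.5       # drive size from the thread
REQUIRED_TB = 10    # example space requirement from the paragraph above

# Smallest number of data (non-redundancy) disks that meets the requirement.
min_data_disks = ceil(REQUIRED_TB / DISK_TB)

# (data disks, description) for a few 12-disk layouts discussed in the thread.
candidates = [
    (12 - 3, "1 x 12-disk raidz3"),
    (2 * (6 - 2), "2 x 6-disk raidz2"),
    (6, "6 x 2-way mirrors"),
]

print(f"need at least {min_data_disks} data disks")
for data_disks, name in candidates:
    usable = data_disks * DISK_TB
    verdict = "ok" if data_disks >= min_data_disks else "too small"
    print(f"{name:20s} {usable:5.1f} TB  {verdict}")
```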

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] Rethinking my zpool

2010-03-21 Thread Richard Elling
On Mar 20, 2010, at 10:12 PM, Brandon High wrote:

> On Sat, Mar 20, 2010 at 1:35 PM, Richard Elling wrote:
> > For those disinclined to click, data retention when mirroring wins over raidz
> > when looking at the problem from the perspective of number of drives
> > available.  Why? Because 5+1 raidz survives the loss of any disk, but 3 sets
> > of 2-way mirrors can survive the loss of 3 disks, as long as 2 of those disks
> > are not in the same set. The rest is just math.
> 
> The one dimension left out in your comparison is the portion of space that's 
> available for use vs. redundancy overhead. I'm sure you just never thought of 
> it. ;-)

There are two dimensions missing: space and performance.

> For 12 disks using a 4-way mirror, you'd have 75% overhead but the best 
> MTTDL. raidz3 is only 25% overhead, but provides a better MTTDL than 3-way 
> mirrors (at 66% overhead). raidz2 (16% overhead) has better MTTDL than 2-way 
> mirrors (at 50%).

The "all-in" post puts all three on one chart, but in this case it is
for 46 disks, not 12.
http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance1

> So clearly, if fault tolerance is the absolute most important factor, a 
> really big mirror is best. This will also give very good read performance. I 
> imagine a 12-way mirror would last a while (2.09E+57 years according to 
> Richard's formula) but it's also at high cost.
> 
> I think the only real route to follow is to determine how much space you 
> need, and then optimize MTTDL and performance around that constraint. If you 
> determine that you need 10 TB available, then (using 1.5T drives) you need to 
> use at least 7 disks for data. That means a 12-disk raidz3 (13.5 TB), or 2x 
> 6-disk raidz2 (12 TB). The raidz3 will have higher fault tolerance, but lower 
> performance.

Indeed.  Space, performance, dependability: pick two
 -- richard








Re: [zfs-discuss] Rethinking my zpool

2010-03-22 Thread Chris Dunbar
Thank you to all who responded. This response in particular was very helpful 
and I think I will stick with my current zpool configuration (choice "a" if 
you're reading below). I primarily host VMware virtual machines over NFS from 
this server's predecessor and this server will be doing the same thing. I think 
the 6 x 2-way mirror configuration gives me the best mix of performance and 
fault tolerance.

Regards,
Chris Dunbar

On Mar 19, 2010, at 5:44 PM, Erik Trimble wrote:

> Chris Dunbar - Earthside, LLC wrote:
> > Hello,
> >
> > After being immersed in this list and other ZFS sites for the past few 
> > weeks I am having some doubts about the zpool layout on my new server. It's 
> > not too late to make a change so I thought I would ask for comments. My 
> > current plan is to have 12 x 1.5 TB disks in what I would normally call a 
> > RAID 10 configuration. That doesn't seem to be the right term here, but 
> > there are 6 sets of mirrored disks striped together. I know that "smaller" 
> > sets of disks are preferred, but how small is small? I am wondering if I 
> > should break this into two sets of 6 disks. I do have a 13th disk available 
> > as a hot spare. Would it be available for either pool if I went with two? 
> > Finally, would I be better off with raidz2 or something else instead of the 
> > striped mirrored sets? Performance and fault tolerance are my highest 
> > priorities.
> >
> > Thank you,
> > Chris Dunbar
> There's not much benefit I can see to having two pools if both are using 
> the same configuration (i.e all mirrors or all raidz). There are reasons 
> to do so, but I don't see that they would be of any real benefit for 
> what you describe. A Hot spare disk can be assigned to multiple pools 
> (often referred to as a "global" hot spare)
> 
> Preferences for raidz[123] configs is to have 4-6 data disks in the vdev.
> 
> Realistically speaking, you have several different (practical) 
> configurations possible, in order of general performance:
> 
> (a) 6 x 2-way mirrors + 1 pool hot spare -> 9TB usable
> (b) 4 x 3-way mirrors + 1 pool hot spare -> 6TB usable
> (c) 1 6-disk raidz + 1 7-disk raidz -> 16.5TB usable
> (d) 2 6-disk raidz + 1 pool hot spare -> 15TB usable
> (e) 1 6-disk raidz2 + 1 7-disk raidz2 -> 13.5TB usable
> (f) 2 6-disk raidz2 + 1 pool hot spare -> 12TB usable
> (g) 1 6-disk raidz3 + 1 7-disk raidz3 -> 10.5TB usable
> (h) 1 13-disk raidz3 -> 15TB usable
> 
> Given the size of your disks, resilvering is likely to have a 
> significant time problem in any RAIDZ[123] configuration. That is, 
> unless you are storing (almost exclusively) very large files, resilver 
> time is going to be significant, and can potentially be radically higher 
> than a mirrored config.
> 
> The mirroring configs will out-perform raidz[123] on everything except 
> large streaming write/reads, and even then, it's a toss-up. 
> 
> Overall, the (a), (d), and (f) configurations generally offer the best 
> balance of redundancy, space, and performance.
> 
> Here's the chances to survive disk failures (assuming hot spares are 
> unable to be used; that is, all disk failures happen in a short period 
> of time) - note that all three can always survive a single disk failure:
> 
> (a) 90% for 2, 73% for 3, 49% for 4, 25% for 5.
> (d) 55% for 2, 27% for 3, 0% for 4 or more
> (f) 100% for 2, 80% for 3, 56% for 4, 0% for 5.
> 
> 
> Depending on your exact requirements, I'd go with (a) or (f) as the best 
> choices - (a) if performance is more important, (f) if redundancy 
> overrides performance.
> 
> -- 
> Erik Trimble
> Java System Support
> Mailstop: usca22-123
> Phone: x17195
> Santa Clara, CA
> 
