Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-18 Thread Brandon High
On Thu, Sep 17, 2009 at 11:41 AM, Adam Leventhal a...@eng.sun.com wrote:
  RAID-3        bit-interleaved parity (basically not used)

There was a hardware RAID chipset that used RAID-3. Netcell Revolution
I think it was called.

It looked interesting and I thought about grabbing one at the time but
never got around to it. Netcell is defunct or got bought out, so the
controller is no longer available.

-B

-- 
Brandon High : bh...@freaks.com
Always try to do things in chronological order; it's less confusing that way.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-18 Thread Bill Sommerfeld

On Wed, 2009-09-16 at 14:19 -0700, Richard Elling wrote:
 Actually, I had a ton of data on resilvering which shows mirrors and
 raidz equivalently bottlenecked on the media write bandwidth. However,
 there are other cases which are IOPS bound (or CR bound :-) which
 cover some of the postings here. I think Sommerfeld has some other
 data which could be pertinent.

I'm not sure I have data, but I have anecdotes and observations, and a
few large production pools used for solaris development by me and my
coworkers.

the biggest one (by disk count) takes 80-100 hours to scrub and/or
resilver.

my working hypothesis is that pools which:
 1) have a lot of files, directories, filesystems, and periodic
snapshots
 2) have atime updates enabled (default config)
 3) have regular (daily) jobs doing large-scale filesystem tree-walks

wind up rewriting most blocks of the dnode files on every tree walk
doing atime updates, and as a result the dnode file (but not most of the
blocks it points to) differs greatly from daily snapshot to daily
snapshot.

as a result, scrub/resilver traversals end up spending most of their 
time doing random reads of the dnode files of each snapshot.
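
(given item 2 above, one mitigation worth experimenting with on pools
like this is simply turning atime updates off on the datasets the
nightly tree-walk jobs touch -- the dataset name below is only a
placeholder:

# zfs set atime=off tank/builds

assuming, of course, that nothing on the system relies on access times.)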

here are some bugs that, if fixed, might help:

6678033 resilver code should prefetch
6730737 investigate colocating directory dnodes

- Bill

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-17 Thread Eugen Leitl
On Wed, Sep 16, 2009 at 10:23:01AM -0700, Richard Elling wrote:

 This line of reasoning doesn't get you very far.  It is much better to  
 take a look at
 the mean time to data loss (MTTDL) for the various configurations.  I  
 wrote a
 series of blogs to show how this is done.
 http://blogs.sun.com/relling/tags/mttdl

Excellent information, thanks! I presume MTTDL[1] (in years) and
MTTDL[2] are the same as in
http://blogs.sun.com/relling/entry/a_story_of_two_mttdl

Do you think it would be possible to publish the same information
for 24 drives (not all of us can buy a Thumper), and maybe
include raidz3 in the number crunching?

Thanks!

-- 
Eugen* Leitl leitl http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-17 Thread Tomas Ögren
On 17 September, 2009 - Eugen Leitl sent me these 2,0K bytes:

 On Wed, Sep 16, 2009 at 08:02:35PM +0300, Markus Kovero wrote:
 
  It's possible to do 3-way (or more) mirrors too, so you may achieve better 
  redundancy than raidz2/3
 
 I understand there's almost no additional performance penalty to raidz3 
 over raidz2 in terms of CPU load. Is that correct?
 
 So SSDs for ZIL/L2ARC don't bring that much when used with raidz2/raidz3,
 if I write a lot, at least, and don't access the cache very much, according
 to some recent posts on this list.
 
 How much drive space am I losing with mirrored pools versus raidz3? IIRC
 in RAID 10 it's only 10% over RAID 6, which is why I went for RAID 10 in
 my 14-drive SATA (WD RE4) setup.

It's not a fixed value per technology; it depends on the number of disks
per group. RAID5/RAIDZ1 loses 1 disk worth to parity per group.
RAID6/RAIDZ2 loses 2 disks. RAIDZ3 loses 3 disks. RAID1/mirror loses
half the disks. So in your 14-drive case, if you go for one big
raid6/raidz2 setup (which is larger than recommended for performance
reasons), you will lose 2 disks worth of storage to parity, leaving 12
disks worth of data. With raid10 you will lose half, 7 disks, to
parity/redundancy. With two raidz2 sets, you will get (5+2)+(5+2), that
is 5+5 disks worth of storage and 2+2 disks worth of redundancy. The
actual redundancy/parity is spread over all disks, unlike RAID-3, which
has a dedicated parity disk.

For more info, see for example http://en.wikipedia.org/wiki/RAID

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-17 Thread Erik Trimble

Eugen Leitl wrote:

On Wed, Sep 16, 2009 at 08:02:35PM +0300, Markus Kovero wrote:

  

It's possible to do 3-way (or more) mirrors too, so you may achieve better 
redundancy than raidz2/3



I understand there's almost no additional performance penalty to raidz3 
over raidz2 in terms of CPU load. Is that correct?
  
As far as I understand the z3 algorithms, the performance penalty is 
very slightly higher than z2. I think it's reasonable to treat z1, z2, 
and z3 as equal in terms of CPU load.



So SSDs for ZIL/L2ARC don't bring that much when used with raidz2/raidz3,
if I write a lot, at least, and don't access the cache very much, according
to some recent posts on this list.
  

Not true.

Remember:  ZIL = write cache
  L2ARC = read cache

So, if you have a write-heavy workload which seldom does much more than 
large reads, an L2ARC SSD doesn't make much sense. Main RAM should 
suffice for storing the read cache.


Random reads aren't fast on RAIDZ, so a read cache is a good thing if 
you are doing that kind of I/O.  Similarly, random writes (particularly 
small random writes) suck hard on RAIDZ, so a write cache is a 
fabulous idea there.


If you are doing very large sequential writes to a RAIDZ (any sort) 
pool, then a write cache will likely be much less helpful.  But 
remember, very large means that you frequently exceed the size of the 
SSD you've allocated for the ZIL.  I'd have to run the numbers, but you 
should still see a major performance improvement by using an SSD for ZIL, 
up to the point where your typical write load exceeds 10% of the size of 
the SSD.  Naturally, write-heavy workloads will be murder on an MLC or hybrid 
SSD's life expectancy, though a large sequential-write-heavy load will 
allow the SSD to perform better and longer than a small random write load.


A write SSD will help you up until you try to write to the SSD faster 
than it can flush out its contents to actual disk. So, you need to take 
into consideration exactly how much data is coming in, and the write 
speed of your (non-SSD) disks. If you are continuously (and constantly) 
exceeding the speed of your disks with incoming data, then SSDs won't 
really help. You'll see some help up until the SSD fills up, then 
performance will drop to what it would be if the SSD didn't exist.

Doing [very] rough calculations, let's say your SSD has a read/write 
throughput of 200MB/s, and is 100GB in size. If your hard drives can 
only do 50MB/s,   then you can write up to 150MB/s to the SSD, read 
50MB/s  from the SSD, and write 50MB/s to the disks.  This means, each 
second, you fill the SSD with 100MB more data that can't be flushed out 
fast enough.  At 100MB/s, it takes 1,000 seconds to fill 100GB. So, in 
about 17 minutes, you've completely filled the SSD, and performance 
drops like a rock.   There is a similar cliff problem around IOPS.
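
(Written as a formula -- using Erik's illustrative numbers above, not
measurements:
$t_{\mathrm{fill}} = C_{\mathrm{SSD}} / (W_{\mathrm{in}} - W_{\mathrm{drain}})
= 100\,\mathrm{GB} / (150 - 50)\,\mathrm{MB/s} = 1000\,\mathrm{s} \approx 17$ minutes.)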



How much drive space am I losing with mirrored pools versus raidz3? IIRC
in RAID 10 it's only 10% over RAID 6, which is why I went for RAID 10 in
my 14-drive SATA (WD RE4) setup.
  
Basic math says for N disks, you get N-3 disks worth of space for a RAIDZ3, 
and N/2 for a 2-way mirror.  N-3 >= N/2 for all N of 6 or more.  But, 
remember, you'll generally need at least one hot spare for a mirror, so 
really, the equation looks like this:


N-3 > (N/2) - 1, which means RAIDZ3 gives you more space for N > 4


Let's assume I want to fill a 24-drive Supermicro chassis with 1 TByte
WD Caviar Black or 2 TByte RE4 drives, and use 4x X25-M 80 GByte
2nd gen Intel consumer drives, mirrored, each pair as ZIL/L2ARC
for the 24 SATA drives behind them. Let's assume CPU is not an issue,
with dual-socket Nehalems and 24 GByte RAM or more. There are applications
packaged in Solaris containers running on the same box, however.
  
Remember to take a look at Richard's spreadsheet about drive errors and 
the amount of time you can expect to go without serious issue.He's 
also got good stuff about optimizing for speed vs space.


http://blogs.sun.com/relling/


Quick math for a 24-drive setup:

Scenario A:  stripe of mirrors, plus global spares.
             11 x 2-way mirror = 11 disks of data, plus 2
             additional hot spares


Scenario B:  stripe of raidz3, no global spares.
             3 x 8-drive RAIDZ3 (5 data + 3 parity drives) =
             3 x 5 = 15 data drives, with a total of 9 parity drives

Thus, A gives you about 30% less disk space than B.
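
For illustration, the two layouts might be created roughly like this
(device names are placeholders, not a real chassis mapping).

Scenario A:

# zpool create bigpool \
    mirror c0t0d0 c1t0d0 mirror c0t1d0 c1t1d0 mirror c0t2d0 c1t2d0 \
    mirror c0t3d0 c1t3d0 mirror c0t4d0 c1t4d0 mirror c0t5d0 c1t5d0 \
    mirror c0t6d0 c1t6d0 mirror c0t7d0 c1t7d0 mirror c0t8d0 c1t8d0 \
    mirror c0t9d0 c1t9d0 mirror c0t10d0 c1t10d0 \
    spare c0t11d0 c1t11d0

Scenario B:

# zpool create bigpool \
    raidz3 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0 \
    raidz3 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
    raidz3 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0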




Let's say the workload is mostly multiple streams (hundreds to thousands
simultaneously, some continuous, some bursty), each writing data
to the storage system. However, a few clients will be using database-like
queries to read, potentially across the entire data store.

With the above workload, is raidz2/raidz3 right out, and will I need mirrored
pools?
  
The database queries will definitely benefit from an L2ARC SSD - the size 
of that SSD depends on exactly how much data the query has to check. 

Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-17 Thread Eugen Leitl
On Thu, Sep 17, 2009 at 12:55:35PM +0200, Tomas Ögren wrote:

 It's not a fixed value per technology, it depends on the number of disks
 per group. RAID5/RAIDZ1 loses 1 disk worth to parity per group.
 RAID6/RAIDZ loses 2 disks. RAIDZ3 loses 3 disks. Raid1/mirror loses
 half the disks. So in your 14 drive case, if you go for one big
 raid6/raidz2 setup (which is larger than recommended for performance

I presume for 24 disks (my next project; the current 16-disk 
one had to be converted to CentOS for software compatibility reasons) 
you would recommend splitting them into two groups of 12 disks each. 
With raidz3, there would be 9 disks left for data per group, 18 total -- 
36 TBytes effective in the case of 2 TByte WD RE4 drives, half that 
for WD Caviar Black. How many hot spares should I leave in 
each pool, one or more?

Is it safe to stripe over two such 12-disk pools? 
Or is mirror the right thing to do, regardless of drive costs?

Speaking of which, does anyone use NFSv4 clustering in production
to aggregate individual zfs boxes? Experiences good/bad?

 reasons), you will lose 2 disks worth of storage to parity leaving 12
 disks worth of data. With raid10 you will lose half, 7 disks to
 parity/redundancy. With two raidz2 sets, you will get (5+2)+(5+2), that
 is 5+5 disks worth of storage and 2+2 disks worth of redundancy. The
 actual redudancy/parity is spread over all disks, not like raid3 which
 has a dedicated parity disk.

So raidz3 has a dedicated parity disk? I couldn't see that from
skimming http://blogs.sun.com/ahl/entry/triple_parity_raid_z
 
 For more info, see for example http://en.wikipedia.org/wiki/RAID

Unfortunately, this is very thin on ZFS.

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
is very helpful, but it doesn't offer concrete layout examples for
odd numbers of disks (understandable, since Sun has to sell the
Thumper), and is pretty mum on raidz3.

Thank you. This list is fun, and helpful.

-- 
Eugen* Leitl leitl http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-17 Thread Darren J Moffat

Erik Trimble wrote:

So SSDs for ZIL/L2ARC don't bring that much when used with raidz2/raidz3,
if I write a lot, at least, and don't access the cache very much, 
according

to some recent posts on this list.
  

Not true.

Remember:  ZIL = write cache


ZIL is NOT a write cache.  The ZIL is the Intent Log, not a cache.  It is 
used only for synchronous writes.  It is not a cache because the term 
cache implies the data is also somewhere else and you lose nothing but 
potential performance if you lose the cache.


ZFS calls the devices used to hold the ZIL (there is one ZIL per 
dataset) a SLOG (Separate Log device).


Note also the recent addition of the logbias dataset property.
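
A minimal example of using it, with a hypothetical dataset name:

# zfs set logbias=throughput tank/oradata

logbias=latency (the default) keeps using the slog; logbias=throughput
bypasses it for that dataset's writes, leaving the slog for datasets
that really need low-latency commits.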

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-17 Thread Erik Trimble

Darren J Moffat wrote:

Erik Trimble wrote:
So SSDs for ZIL/L2ARC don't bring that much when used with 
raidz2/raidz3,
if I write a lot, at least, and don't access the cache very much, 
according

to some recent posts on this list.
  

Not true.

Remember:  ZIL = write cache


ZIL is NOT a write cache.  The ZIL is the Intent Log, not a cache.  It 
is used only for synchronous writes.  It is not a cache because the 
term cache implies the data is also somewhere else and you lose 
nothing but potential performance if you lose the cache.


ZFS calls the devices used to hold the ZIL (there is one ZIL per 
dataset) a SLOG (Separate Log device).


Note also the recent addition of the logbias dataset property.



I should have more properly used the term buffer, which is what ZIL is 
more closely related to. Sorry about that - I didn't mean to imply that 
the ZIL was the same as something like a STK6140's NVRAM.




--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-17 Thread Marion Hakanson
rswwal...@gmail.com said:
 It's not the stripes that make a difference, but the number of controllers
 there.
 
 What's the system config on that puppy? 

The zpool status -v output was from a Thumper (X4500), slightly edited,
since in our real-world Thumper, we use c6t0d0 in c5t4d0's place in the
optimal layout I posted, because c5t4d0 is used in the boot-drive mirror.

See the following for our 2006 Thumper benchmarks, which appear to bear
out Richard Elling's RaidOptimizer analysis:
http://acc.ohsu.edu/~hakansom/thumper_bench.html

While I'm at it, filebench numbers from a recent J4400-based database
server deployment, with some slog vs no-slog comparisons (sorry, no
SSD's available here yet):
http://acc.ohsu.edu/~hakansom/j4400_bench.html

Regards,

Marion


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-17 Thread Adam Leventhal
On Thu, Sep 17, 2009 at 01:32:43PM +0200, Eugen Leitl wrote:
  reasons), you will lose 2 disks worth of storage to parity leaving 12
  disks worth of data. With raid10 you will lose half, 7 disks to
  parity/redundancy. With two raidz2 sets, you will get (5+2)+(5+2), that
  is 5+5 disks worth of storage and 2+2 disks worth of redundancy. The
  actual redudancy/parity is spread over all disks, not like raid3 which
  has a dedicated parity disk.
 
 So raidz3 has a dedicated parity disk? I couldn't see that from
 skimming http://blogs.sun.com/ahl/entry/triple_parity_raid_z

Note that Tomas was talking about RAID-3 not raidz3. To summarize the RAID
levels:

  RAID-0striping
  RAID-1mirror
  RAID-2ECC (basically not used)
  RAID-3bit-interleaved parity (basically not used)
  RAID-4block-interleaved parity
  RAID-5block-interleaved distributed parity
  RAID-6block-interleaved double distributed parity

raidz1 is most like RAID-5; raidz2 is most like RAID-6. There's no standard RAID
level that covers more than two parity disks; raidz3 is most like RAID-6,
but with triple distributed parity.

Adam

-- 
Adam Leventhal, Fishworks http://blogs.sun.com/ahl
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Thomas Burgess
It should be faster.  It really depends on what you are using it for, though.
I've been using raidz for my system and I'm very happy with it.


On Wed, Sep 16, 2009 at 8:55 AM, en...@businessgrade.com wrote:

 Hi. If I am using slightly more reliable SAS drives versus SATA, SSDs for
 both L2Arc and ZIL and lots of RAM, will a mirrored pool of say 24 disks
 hold any significant advantages over a RAIDZ pool?





 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Edward Ned Harvey
 Hi. If I am using slightly more reliable SAS drives versus SATA, SSDs
 for both L2Arc and ZIL and lots of RAM, will a mirrored pool of say 24
 disks hold any significant advantages over a RAIDZ pool?

Generally speaking, striping mirrors will be faster than raidz or raidz2,
but it will require a higher number of disks and therefore higher cost to
get the same usable space.  The main reason to use raidz or raidz2 instead
of striping mirrors would be to keep the cost down, or to get higher usable
space out of a fixed number of drives.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread David Magda
On Wed, September 16, 2009 10:31, Edward Ned Harvey wrote:
 Hi. If I am using slightly more reliable SAS drives versus SATA, SSDs
 for both L2Arc and ZIL and lots of RAM, will a mirrored pool of say 24
 disks hold any significant advantages over a RAIDZ pool?

 Generally speaking, striping mirrors will be faster than raidz or raidz2,
 but it will require a higher number of disks and therefore higher cost to
 get the same usable space.  The main reason to use raidz or raidz2 instead
 of striping mirrors would be to keep the cost down, or to get higher
 usable space out of a fixed number of drives.

And if you want space /and/ speed, then ZFS' hybrid storage pools are
something worth looking into.
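
A rough sketch of what such a hybrid pool looks like (device names are
hypothetical; the c2/c3 devices stand in for SSDs):

# zpool create tank mirror c0t0d0 c1t0d0 mirror c0t1d0 c1t1d0
# zpool add tank log mirror c2t0d0 c3t0d0
# zpool add tank cache c2t1d0

i.e. rotating disks for the main vdevs, a mirrored SSD slog for
synchronous writes, and an SSD L2ARC device for reads.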

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread eneal

Quoting David Magda dma...@ee.ryerson.ca:


On Wed, September 16, 2009 10:31, Edward Ned Harvey wrote:

Hi. If I am using slightly more reliable SAS drives versus SATA, SSDs
for both L2Arc and ZIL and lots of RAM, will a mirrored pool of say 24
disks hold any significant advantages over a RAIDZ pool?


Generally speaking, striping mirrors will be faster than raidz or raidz2,
but it will require a higher number of disks and therefore higher cost to
get the same usable space.  The main reason to use raidz or raidz2 instead
of striping mirrors would be to keep the cost down, or to get higher
usable space out of a fixed number of drives.


And if you want space /and/ speed, then ZFS' hybrid storage pools is
something worth looking into.


This is precisely my point. If I'm taking the hybrid approach - what  
advantages do mirrored pools hold over RAIDZ?

As I mentioned, a large amount of RAM, and SSD's for both L2arc and ZIL.








___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Cindy . Swearingen

In addition, if you need to move disks around until the device removal
CR integrates, mirrored pools are the more flexible choice.

Detaching disks from a mirror isn't ideal but if you absolutely have
to reuse a disk temporarily then go with mirrors. See the output below.
You can replace disks in either configuration if you want to switch
smaller disks with larger disks, for example.

Cindy

# zpool status rzpool
  pool: rzpool
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Tue Sep 15 
14:41:24 2009

config:

NAMESTATE READ WRITE CKSUM
rzpool  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c2t0d0  ONLINE   0 0 0
c2t2d0  ONLINE   0 0 0
c2t4d0  ONLINE   0 0 0
c2t5d0  ONLINE   0 0 0
c2t6d0  ONLINE   0 0 0
spares
  c2t7d0AVAIL

errors: No known data errors
# zpool detach rzpool c2t6d0
cannot detach c2t6d0: only applicable to mirror and replacing vdevs
# zpool destroy rzpool
# zpool create mirpool mirror c2t0d0 c2t2d0 mirror c2t4d0 c2t6d0 spare 
c2t5d0

# zpool status mirpool
  pool: mirpool
 state: ONLINE
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
mirpool ONLINE   0 0 0
  mirrorONLINE   0 0 0
c2t0d0  ONLINE   0 0 0
c2t2d0  ONLINE   0 0 0
  mirrorONLINE   0 0 0
c2t4d0  ONLINE   0 0 0
c2t6d0  ONLINE   0 0 0
spares
  c2t5d0AVAIL

errors: No known data errors
# zpool detach mirpool c2t6d0
#

On 09/16/09 08:31, Edward Ned Harvey wrote:

Hi. If I am using slightly more reliable SAS drives versus SATA, SSDs
for both L2Arc and ZIL and lots of RAM, will a mirrored pool of say 24
disks hold any significant advantages over a RAIDZ pool?



Generally speaking, striping mirrors will be faster than raidz or raidz2,
but it will require a higher number of disks and therefore higher cost to
get the same usable space.  The main reason to use raidz or raidz2 instead
of striping mirrors would be to keep the cost down, or to get higher usable
space out of a fixed number of drives.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Scott Meilicke
I think in theory the ZIL/L2ARC should make things nice and fast if your 
workload includes sync requests (database, iSCSI, NFS, etc.), regardless of the 
backend disks. But the only sure way to know is to test with your workload.

-Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Bob Friesenhahn

On Wed, 16 Sep 2009, en...@businessgrade.com wrote:

Hi. If I am using slightly more reliable SAS drives versus SATA, SSDs for 
both L2Arc and ZIL and lots of RAM, will a mirrored pool of say 24 disks hold 
any significant advantages over a RAIDZ pool?


A mirrored pool will support more IOPs.  This is even true when using 
SSDs for L2Arc and ZIL.  Using a SSD for the ZIL dramatically reduces 
synchronous write latency but the data still needs to be committed to 
backing store.  If the bulk of the synchronous writes are also random 
writes, then the throughput is still dependent on the IOPs capacity of 
the backing store.  Similarly, more RAM and/or a large SSD L2Arc 
improves the probability that a repeated read will be retrieved from 
the ARC rather than the backing store but this depends on the size of 
the working set, and whether the reads are ever repeated.  There are 
cases (e.g. daily backups) where reads are rarely repeated.


In summary, write IOPs are still write IOPs, and a read cache only 
works effectively for repeated reads (or reads of recently written 
data).


You still need to look at the nature of your workload in order to 
decide if RAIDZ is appropriate.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Marty Scholes
 Generally speaking, striping mirrors will be faster
 than raidz or raidz2,
 but it will require a higher number of disks and
 therefore higher cost to
 The main reason to use
 raidz or raidz2 instead
 of striping mirrors would be to keep the cost down,
 or to get higher usable
 space out of a fixed number of drives.

While it has been a while since I have done storage management for critical 
systems, the advantage I see with RAIDZN is better fault tolerance: any N 
drives may fail before the set loses data.

With straight mirroring, failure of the wrong two drives will invalidate the 
whole pool.

The advantage of striped mirrors is that they offer a better chance of higher 
IOPS (assuming the I/O is distributed correctly).  Also, it might be easier to 
expand a mirror by upgrading only two drives with larger drives.  With RAIDZ, 
the entire stripe of drives would need to be upgraded.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Markus Kovero
It's possible to do 3-way (or more) mirrors too, so you may achieve better 
redundancy than raidz2/3

Yours
Markus Kovero

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Marty Scholes
Sent: 16. syyskuuta 2009 19:38
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] RAIDZ versus mirrroed

 Generally speaking, striping mirrors will be faster
 than raidz or raidz2,
 but it will require a higher number of disks and
 therefore higher cost to
 The main reason to use
 raidz or raidz2 instead
 of striping mirrors would be to keep the cost down,
 or to get higher usable
 space out of a fixed number of drives.

While it has been a while since I have done storage management for critical 
systems, the advantage I see with RAIDZN is better fault tolerance: any N 
drives may fail before  the set goes critical.

With straight mirroring, failure of the wrong two drives will invalidate the 
whole pool.

The advantage of striped mirrors is that it offers a better chance of higher 
iops (assuming the I/O is distributed correctly).  Also, it might be easier to 
expand a mirror by upgrading only two drives with larger drives.  With RAID, 
the entire stripe of drives would need to be upgraded.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Thomas Burgess
Mirrors are much quicker to replace if one DOES fail though... so I would
think that bad stuff could happen with EITHER solution. If you buy a bunch
of hard drives for a raidz and they are all from the same batch, they might
all fail around the same time... what if you have a raidz2 group and 2 drives
fail, then you're adding 2 drives back and another fails before the resilver
is complete because it takes SO long? At least with mirrors they
resilver fast.

The bottom line is that bad stuff CAN happen and often does...so don't let
raidz or mirrors be the only solution you have.  Redundancy is good.

More redundancy is better... but backups are the best.

On Wed, Sep 16, 2009 at 1:23 PM, Richard Elling richard.ell...@gmail.comwrote:

 On Sep 16, 2009, at 9:38 AM, Marty Scholes wrote:

 Generally speaking, striping mirrors will be faster
 than raidz or raidz2,
 but it will require a higher number of disks and
 therefore higher cost to
 The main reason to use
 raidz or raidz2 instead
 of striping mirrors would be to keep the cost down,
 or to get higher usable
 space out of a fixed number of drives.


 While it has been a while since I have done storage management for
 critical systems, the advantage I see with RAIDZN is better fault tolerance:
 any N drives may fail before  the set goes critical.

 With straight mirroring, failure of the wrong two drives will invalidate
 the whole pool.


 This line of reasoning doesn't get you very far.  It is much better to take
 a look at
 the mean time to data loss (MTTDL) for the various configurations.  I wrote
 a
 series of blogs to show how this is done.
 http://blogs.sun.com/relling/tags/mttdl

  -- richard



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Richard Elling

On Sep 16, 2009, at 10:42 AM, Thomas Burgess wrote:

Mirrors are much quicker to replace if one DOES fail though...so i  
would think that bad stuff could happen with EITHER solutionIf  
you buy a bunch of hard drives for a raidz and they are all from the  
same batch they might all fail around the same time...what if you  
have a raidz2 group and 2 drives fail, then you're adding 2 drives  
back and another fails before it's complete because it takes SO long  
to resilver? At least with mirrors they resilver fast.


In general, resilver is bound by either the media write bandwidth of  
the resilvering device
or the random IOP capacity of the remaining good drives. Although I  
don't know of any
studies comparing mirrors vs raidz resilvering, I would not expect  
much difference between the two, all else held constant.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Thomas Burgess
hrm, I always thought raidz took longer... learn something every day =)


On Wed, Sep 16, 2009 at 2:14 PM, Richard Elling richard.ell...@gmail.comwrote:

 On Sep 16, 2009, at 10:42 AM, Thomas Burgess wrote:

  Mirrors are much quicker to replace if one DOES fail though...so i would
 think that bad stuff could happen with EITHER solutionIf you buy a bunch
 of hard drives for a raidz and they are all from the same batch they might
 all fail around the same time...what if you have a raidz2 group and 2 drives
 fail, then you're adding 2 drives back and another fails before it's
 complete because it takes SO long to resilver? At least with mirrors they
 resilver fast.


 In general, resilver is bound by either the media write bandwidth of the
 resilvering device
 or the random IOP capacity of the remaining good drives. Although I don't
 know of any
 studies comparing mirrors vs raidz resilvering, I would not expect much
 difference between
 the two, all else held constant.
  -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Marty Scholes
 This line of reasoning doesn't get you very far.
  It is much better to take a look at
 the mean time to data loss (MTTDL) for the various
 configurations.  I wrote a
 series of blogs to show how this is done.
 http://blogs.sun.com/relling/tags/mttdl

I will play the Devil's advocate here and point out that the chart shows MTTDL 
for RAIDZ2, both 6 and 8 disk, is much better than mirroring.

The chart does show that three-way mirroring is better still, and I would guess 
that RAIDZ3 surpasses that.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Richard Elling

On Sep 16, 2009, at 12:50 PM, Marty Scholes wrote:


This line of reasoning doesn't get you very far.
It is much better to take a look at
the mean time to data loss (MTTDL) for the various
configurations.  I wrote a
series of blogs to show how this is done.
http://blogs.sun.com/relling/tags/mttdl


I will play the Devils advocate here and point out that the chart  
shows MTTDL for RAIDZ2, both 6 and 8 disk, is much better than  
mirroring.


The chart does show that three way mirroring is better still and I  
would guess that RAIDZ3 surpasses that.


Yes.  This is a mathematical way of saying "lose any P+1 of N disks."

The important part is that the number of parity disks (or mirror sides)
is the big knob to use. But every choice is a trade-off.  For a single
set, the results should be intuitive. But as you vary the number of sets,
it quickly becomes easier to use the models.  For example, with a
Thumper, you have 48 disks and zillions of possible combinations
to choose from.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Bob Friesenhahn

On Wed, 16 Sep 2009, Thomas Burgess wrote:


hrm, i always thought raidz took longerlearn something every day =)


And you were probably right, in spite of Richard's lack of knowledge 
of a study or the feeling in his gut.  Just look at the many postings 
here about resilvering and you will see far more complaints about 
raidz taking a long time.


Resilver of mirrors will surely do better for large pools which 
continue to be used during the resilvering.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Marty Scholes
 Yes.  This is a mathematical way of saying
 lose any P+1 of N disks.

I am hesitant to beat this dead horse, yet it is a nuance that either I have 
completely misunderstood or many people I've met have completely missed.

Whether a stripe of mirrors or a mirror of stripes, any single failure makes 
the array critical, i.e. one failure from disaster.

For example, suppose a stripe of four sets of mirrors.  That stripe has 8 disks 
total: four data and four mirrors.  If one disk fails, say on mirror set 3, 
then set 3 is running on a single disk.  Should that remaining disk in set 3 
fail, the whole stripe is lost.  Yes, the stripe is safe as long as the next 
failure is not from set 3.

Contrast that to RAIDZ3.  Suppose seven total disks with the same effective 
pool size: 4 data and 3 parity.  If any single disk is lost then the array is 
not critical and can still survive any other loss.  In fact, it can survive a 
total of any three disk failures before it becomes critical.

I just see it too often where someone states that a stripe of four mirror sets 
can sustain four disk failures.  Yes, that's true, as long as the correct four 
disks fail.  If we could control which disks fail, then none of this would even 
be necessary, so that argument seems rather silly.
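
A back-of-the-envelope illustration of that point, assuming exactly two
disks fail and that failures are independent and equally likely across
the 8-disk stripe of four mirror pairs:

  $P(\text{pool lost}) = \frac{4}{\binom{8}{2}} = \frac{4}{28} \approx 14\%$

whereas the 7-disk raidz3 survives any two failures, so the corresponding
probability is 0. (This is not an MTTDL model; it only illustrates the
"which disks fail" dependence.)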
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Richard Elling

On Sep 16, 2009, at 1:09 PM, Bob Friesenhahn wrote:


On Wed, 16 Sep 2009, Thomas Burgess wrote:

hrm, i always thought raidz took longerlearn something every  
day =)


And you were probably right, in spite of Richard's lack of knowledge  
of a study or the feeling in his gut.  Just look at the many  
postings here about resilvering and you will see far more complaints  
about raidz taking a long time.


Actually, I had a ton of data on resilvering which shows mirrors and
raidz equivalently bottlenecked on the media write bandwidth. However,
there are other cases which are IOPS bound (or CR bound :-) which
cover some of the postings here. I think Sommerfeld has some other
data which could be pertinent.
 -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Richard Elling

On Sep 16, 2009, at 1:29 PM, Marty Scholes wrote:

Yes.  This is a mathematical way of saying
lose any P+1 of N disks.


I am hesitant to beat this dead horse, yet it is a nuance that  
either I have completely misunderstood or many people I've met have  
completely missed.


Whether a stripe of mirrors or mirror of a stripes, any single  
failure makes the array critical, i.e. one failure from disaster.


For example, suppose a stripe of four sets of mirrors.  That stripe  
has 8 disks total: four data and four mirrors.  If one disk fails,  
say on mirror set 3, then set 3 is running on a single disk.  Should  
that remaining disk in set 3 fail, the whole stripe is lost.  Yes,  
the stripe is safe as long as the next failure is not from set 3.


Yes. I don't think I've blogged the data, but the MTTDL models will
show that RAID-1+0 has a higher MTTDL than RAID-0+1.

Contrast that to RAIDZ3.  Suppose seven total disks with the same  
effective pool size: 4 data and 3 parity.  If any single disk is  
lost then the array is not critical and can still survive any other  
loss.  In fact, it can survive a total of any three disk failures  
before it becomes critical.


Yes, but can you quantify this?  2x better?  5x better?  1.01x better?
The MTTDL models can help you quantify this.

I just see it too often where someone states that a stripe of four  
mirror sets can sustain four disk failures.  Yes, that's true, as  
long as the correct four disks fail.  If we could control which  
disks fail, then none of this would even be necessary, so that  
argument seems rather silly.


The MTTDL models account for this.
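
For readers who haven't seen them, the simplest forms of the model
(Richard's blog posts are the authoritative source; these are the usual
first-order MTTDL[1]-style approximations, ignoring unrecoverable read
errors) are:

  $\mathrm{MTTDL}_{\text{2-way mirror}} \approx \frac{\mathrm{MTBF}^2}{2\,\mathrm{MTTR}}$

  $\mathrm{MTTDL}_{\text{raidz1, G disks}} \approx \frac{\mathrm{MTBF}^2}{G(G-1)\,\mathrm{MTTR}}$

  $\mathrm{MTTDL}_{\text{raidz2, G disks}} \approx \frac{\mathrm{MTBF}^3}{G(G-1)(G-2)\,\mathrm{MTTR}^2}$

with the pool MTTDL roughly the per-vdev figure divided by the number of
top-level vdevs.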
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Eric Schrock

On 09/16/09 14:19, Richard Elling wrote:

On Sep 16, 2009, at 1:09 PM, Bob Friesenhahn wrote:


On Wed, 16 Sep 2009, Thomas Burgess wrote:


hrm, i always thought raidz took longerlearn something every day =)


And you were probably right, in spite of Richard's lack of knowledge 
of a study or the feeling in his gut.  Just look at the many postings 
here about resilvering and you will see far more complaints about 
raidz taking a long time.


Actually, I had a ton of data on resilvering which shows mirrors and
raidz equivalently bottlenecked on the media write bandwidth. However,
there are other cases which are IOPS bound (or CR bound :-) which
cover some of the postings here. I think Sommerfeld has some other
data which could be pertinent.


This primarily has to do with the stripe width and block size.  The 
difference between mirroring and RAID-Z is that with RAID-Z each ZFS 
block is again chunked up into smaller blocks and distributed across the 
stripe.  So if you have a wide stripe (e.g. 32 disks), a 128k block can be 
chunked up into 4k blocks, while a small recordsize can be chunked even 
smaller (e.g. an 8k block into 1k or 512-byte chunks).


ZFS resilvering is metadata based to allow for efficient resilvering of 
outages, but when a relatively full disk needs to be replaced you end up 
bottlenecked on the metadata traversal.   If your blocks are chunked up 
small enough, this becomes a random I/O benchmark for the good disks in 
the RAID stripe.  If your pool is backed by 7200 RPM disks, this can end 
up taking a very long time.


The ZFS team is actively working on improvements in this area.

- Eric

--
Eric Schrock, Fishworkshttp://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Ross Walker
On Sep 16, 2009, at 4:29 PM, Marty Scholes martyscho...@yahoo.com  
wrote:



Yes.  This is a mathematical way of saying
lose any P+1 of N disks.


I am hesitant to beat this dead horse, yet it is a nuance that  
either I have completely misunderstood or many people I've met have  
completely missed.


Whether a stripe of mirrors or mirror of a stripes, any single  
failure makes the array critical, i.e. one failure from disaster.


For example, suppose a stripe of four sets of mirrors.  That stripe  
has 8 disks total: four data and four mirrors.  If one disk fails,  
say on mirror set 3, then set 3 is running on a single disk.  Should  
that remaining disk in set 3 fail, the whole stripe is lost.  Yes,  
the stripe is safe as long as the next failure is not from set 3.


Contrast that to RAIDZ3.  Suppose seven total disks with the same  
effective pool size: 4 data and 3 parity.  If any single disk is  
lost then the array is not critical and can still survive any other  
loss.  In fact, it can survive a total of any three disk failures  
before it becomes critical.


I just see it too often where someone states that a stripe of four  
mirror sets can sustain four disk failures.  Yes, that's true, as  
long as the correct four disks fail.  If we could control which  
disks fail, then none of this would even be necessary, so that  
argument seems rather silly.


There is another type of failure that mirrors help with and that is  
controller or path failures. If one side of a mirror set is on one  
controller or path and the other on another then a failure of one will  
not take down the set.


You can't get that with RAIDZn.

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Bob Friesenhahn

On Wed, 16 Sep 2009, Ross Walker wrote:


There is another type of failure that mirrors help with and that is 
controller or path failures. If one side of a mirror set is on one controller 
or path and the other on another then a failure of one will not take down the 
set.


You can't get that with RAIDZn.


Sure you can.  Just make sure that 'n' is the same as the number of 
data disks, and make sure that each disk in the vdev is accessed via a 
unique controller path. Use raidz3 with six disks.  You probably need 
a lot of vdevs to make this even somewhat cost effective. :-)
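
One way to arrange that six-disk raidz3 suggestion across two controllers
(controller and target numbers are hypothetical):

# zpool create tank \
    raidz3 c0t0d0 c0t1d0 c0t2d0 c1t0d0 c1t1d0 c1t2d0 \
    raidz3 c0t3d0 c0t4d0 c0t5d0 c1t3d0 c1t4d0 c1t5d0

Each six-disk raidz3 vdev has three disks on each controller, so losing an
entire controller costs three disks per vdev, which raidz3 can tolerate.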


Regardless, mirrors are known to be more resilient to temporary path 
failures.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Marion Hakanson
rswwal...@gmail.com said:
 There is another type of failure that mirrors help with and that is
 controller or path failures. If one side of a mirror set is on one
 controller or path and the other on another then a failure of one will   not
 take down the set.
 
 You can't get that with RAIDZn. 

You can if you have a stripe of RAIDZn's, and enough controllers
(or paths) to go around.  The raidz2 below should be able to survive
the loss of two controllers, shouldn't it?

Regards,

Marion


$ zpool status -v
  pool: zp1
 state: ONLINE
 scrub: scrub completed after 7h9m with 0 errors on Mon Sep 14 13:39:03 2009
config:

NAMESTATE READ WRITE CKSUM
bulk_zp01   ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c0t1d0  ONLINE   0 0 0
c1t1d0  ONLINE   0 0 0
c4t1d0  ONLINE   0 0 0
c5t1d0  ONLINE   0 0 0
c6t1d0  ONLINE   0 0 0
c7t1d0  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c0t2d0  ONLINE   0 0 0
c1t2d0  ONLINE   0 0 0
c4t2d0  ONLINE   0 0 0
c5t2d0  ONLINE   0 0 0
c6t2d0  ONLINE   0 0 0
c7t2d0  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c0t3d0  ONLINE   0 0 0
c1t3d0  ONLINE   0 0 0
c4t3d0  ONLINE   0 0 0
c5t3d0  ONLINE   0 0 0
c6t3d0  ONLINE   0 0 0
c7t3d0  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c0t4d0  ONLINE   0 0 0
c1t4d0  ONLINE   0 0 0
c4t4d0  ONLINE   0 0 0
c5t4d0  ONLINE   0 0 0
c6t4d0  ONLINE   0 0 0
c7t4d0  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c0t5d0  ONLINE   0 0 0
c1t5d0  ONLINE   0 0 0
c4t5d0  ONLINE   0 0 0
c5t5d0  ONLINE   0 0 0
c6t5d0  ONLINE   0 0 0
c7t5d0  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c0t6d0  ONLINE   0 0 0
c1t6d0  ONLINE   0 0 0
c4t6d0  ONLINE   0 0 0
c5t6d0  ONLINE   0 0 0
c6t6d0  ONLINE   0 0 0
c7t6d0  ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c0t7d0  ONLINE   0 0 0
c1t7d0  ONLINE   0 0 0
c4t7d0  ONLINE   0 0 0
c5t7d0  ONLINE   0 0 0
c6t7d0  ONLINE   0 0 0
c7t7d0  ONLINE   0 0 0
spares
  c0t0d0AVAIL
  c1t0d0AVAIL
  c4t0d0AVAIL
  c7t0d0AVAIL

errors: No known data errors
$ 



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Ross Walker

On Sep 16, 2009, at 6:50 PM, Marion Hakanson hakan...@ohsu.edu wrote:


rswwal...@gmail.com said:

There is another type of failure that mirrors help with and that is
controller or path failures. If one side of a mirror set is on one
controller or path and the other on another then a failure of one  
will   not

take down the set.

You can't get that with RAIDZn.


You can if you have a stripe of RAIDZn's, and enough controllers
(or paths) to go around.  The raidz2 below should be able to survive
the loss of two controllers, shouldn't it?


It's not the stripes that make a difference, but the number of  
controllers there.


What's the system config on that puppy?

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Ross Walker
On Sep 16, 2009, at 6:43 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us 
 wrote:



On Wed, 16 Sep 2009, Ross Walker wrote:


There is another type of failure that mirrors help with and that is  
controller or path failures. If one side of a mirror set is on one  
controller or path and the other on another then a failure of one  
will not take down the set.


You can't get that with RAIDZn.


Sure you can.  Just make sure that 'n' is the same as the number of  
data disks, and make sure that each disk in the vdev is accessed via  
a unique controller path. Use raidz3 with six disks.  You probably  
need a lot of vdevs to make this even somewhat cost effective. :-)


Well yes, if you have an equal number of parity disks to data disks it 
would survive, but at that point what's the cost effectiveness to 
resiliency ratio?


Regardless, mirrors are known to be more resilient to temporary path  
failures.


As another list member pointed out you could also avoid the issue by  
having a raidz disk per controller. But if I'm buying that kind of big  
iron I might just opt for a 3par or emc and save myself the work, and  
probably some $ too.


-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAIDZ versus mirrroed

2009-09-16 Thread Richard Elling

On Sep 16, 2009, at 7:17 PM, Ross Walker wrote:

more resilient to temporary path failures.


As another list member pointed out you could also avoid the issue by  
having a raidz disk per controller. But if I'm buying that kind of  
big iron I might just opt for a 3par or emc and save myself the  
work, and probably some $ too.


In general, for SAS or SATA, having separate controllers does little
to improve data availability. The reason is that SAS and SATA are
point-to-point or point-to-switch-to-point architectures, and you don't
have the shared bus issues that plague parallel SCSI or IDE. The
controllers themselves are approximately an order of magnitude more
reliable than your CPU and are around two orders of magnitude more
reliable than your disk. Put your redundancy where your reliability is
weak (disk), if you want to improve availability.
http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_vs

 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss