Re[2]: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-13 Thread Robert Milkowski
Hello Frank,

Tuesday, September 12, 2006, 9:41:05 PM, you wrote:

FC> It would be interesting to have a zfs enabled HBA to offload the checksum
FC> and parity calculations.  How much of zfs would such an HBA have to
FC> understand?

That wouldn't be end-to-end checksumming anymore, right?
In that case you might as well disable ZFS checksumming entirely and rely
only on the HW RAID.


-- 
Best regards,
 Robert                            mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-13 Thread Casper . Dik

>We're back into the old argument of "put it on a co-processor, then move 
>it onto the CPU, then move it back onto a co-processor" cycle.   
>Personally, with modern CPUs being so under-utilized these days, and all 
>ZFS-bound data having to move through main memory in any case (whether 
>hardware checksum-assisted or not), use the CPU.  Hardware-assist for 
>checksum sounds nice, but I can't think of it actually being more 
>efficient than doing it on the CPU (it won't actually help performance), 
>so why bother with extra hardware?

Plus it moves part of the resiliency away from where we knew the data
was good (the CPU/computer), across a bus/fabric/whatnot, possibly
causing checksums to be computed over incorrect data.

We already see that with IP checksum off-loading on broken hardware,
and with broken VLAN switches recomputing the Ethernet CRC.

Casper
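
To make the end-to-end argument concrete, here is a minimal sketch in plain C
(a toy Fletcher-style sum, not ZFS's actual fletcher2/fletcher4 or SHA-256
code; the simulated bit flip stands in for corruption on the bus, fabric or
disk):

  /* Toy end-to-end checksum sketch; illustration only, not ZFS code. */
  #include <stdint.h>
  #include <stddef.h>
  #include <stdio.h>
  #include <string.h>

  /* Simple Fletcher-style checksum, computed in host memory. */
  static uint64_t
  toy_checksum(const uint8_t *buf, size_t len)
  {
          uint64_t a = 0, b = 0;

          for (size_t i = 0; i < len; i++) {
                  a += buf[i];
                  b += a;
          }
          return ((b << 32) ^ a);
  }

  int
  main(void)
  {
          uint8_t block[512];
          uint64_t cksum_at_write, cksum_at_read;

          memset(block, 0xab, sizeof (block));

          /* The host computes the checksum while the data is known good... */
          cksum_at_write = toy_checksum(block, sizeof (block));

          /* ...the data then crosses HBA/bus/fabric/disk; simulate a bit flip. */
          block[100] ^= 0x01;

          /* On read, the host recomputes and compares. */
          cksum_at_read = toy_checksum(block, sizeof (block));
          printf("corruption %s\n",
              cksum_at_read == cksum_at_write ? "missed" : "detected");

          /*
           * Had an HBA computed the checksum only after the corrupting
           * transfer, it would have checksummed the bad bytes and the error
           * would pass unnoticed; the checksum would no longer be end to end.
           */
          return (0);
  }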


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-13 Thread Erik Trimble

James C. McPherson wrote:

Richard Elling wrote:

Frank Cusack wrote:
It would be interesting to have a zfs enabled HBA to offload the checksum
and parity calculations.  How much of zfs would such an HBA have to
understand?

[warning: chum]
Disagree.  HBAs are pretty wimpy.  It is much less expensive and more
efficient to move that (flexible!) function into the main CPUs.


I think Richard is in the groove here. All the hba chip
implementation documentation that I've seen (publicly
available of course) indicates that these chips are
already highly optimized engines, and I don't think that
adding extra functionality like checksum and parity
calculations would be an efficient use of silicon/SoI.

cheers,
James


HBAs work at an entirely different layer than the one where checksumming
data would be efficient.


If we're using the OSI-style model for this, HBAs work at layer 1.  And,
as James mentioned, they are highly specialized ASICs for doing just
bus-level communications. It's not like there is extra general-purpose
compute power available (or that any could even be built in).
Checksumming for ZFS requires filesystem-level knowledge, which is
effectively up at OSI layer 6 or 7, and well beyond the understanding of
a lowly HBA (it's just passing bits back and forth, and has no conception
of what they mean).


Essentially, moving block checksumming into the HBA would at best be 
similar to what we see with super-low-cost RAID controllers and the XOR 
function.  Remember how well that works? 

Now, building ZFS-style checksum capability (or, just hardware checksum 
capability for ZFS to call) is indeed proper and possible for _real_ 
hardware RAID controllers, as they are much more akin to standard 
general-purpose CPUs (indeed, most now use a GP processor anyway). 

We're back into the old argument of "put it on a co-processor, then move 
it onto the CPU, then move it back onto a co-processor" cycle.   
Personally, with modern CPUs being so under-utilized these days, and all 
ZFS-bound data having to move through main memory in any case (whether 
hardware checksum-assisted or not), use the CPU.  Hardware-assist for 
checksum sounds nice, but I can't think of it actually being more 
efficient than doing it on the CPU (it won't actually help performance), 
so why bother with extra hardware?


-Erik




Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-12 Thread James C. McPherson

Richard Elling wrote:

Frank Cusack wrote:

It would be interesting to have a zfs enabled HBA to offload the checksum
and parity calculations.  How much of zfs would such an HBA have to
understand?

[warning: chum]
Disagree.  HBAs are pretty wimpy.  It is much less expensive and more
efficient to move that (flexible!) function into the main CPUs.


I think Richard is in the groove here. All the hba chip
implementation documentation that I've seen (publicly
available of course) indicates that these chips are
already highly optimized engines, and I don't think that
adding extra functionality like checksum and parity
calculations would be an efficient use of silicon/SoI.

cheers,
James




Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-12 Thread Richard Elling

Frank Cusack wrote:

It would be interesting to have a zfs enabled HBA to offload the checksum
and parity calculations.  How much of zfs would such an HBA have to
understand?

[warning: chum]
Disagree.  HBAs are pretty wimpy.  It is much less expensive and more
efficient to move that (flexible!) function into the main CPUs.

-- richard



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-12 Thread Frank Cusack
On September 12, 2006 11:35:54 AM -0700 UNIX admin <[EMAIL PROTECTED]> 
wrote:

There are also the speed enhancement provided by a HW
raid array, and
usually RAS too,  compared to a native disk drive but
the numbers on
that are still coming in and being analyzed. (See
previous threads.)


It would be nice if you would attribute your quotes.  Maybe this is a
limitation of the web interface?


Speed enhancements? What is the baseline of comparison?

Hardware RAIDs can be boiled down to two features: cache, which does data
reordering for optimal disk writes, and parity calculation, which is
offloaded from the server's CPU.

But HW calculations still take time, and the in-between, battery backed
cache serves to replace the individual disk caches, because of the
traditional file system approach which had to have some assurance that
the data made it to disk in one way or another.

With ZFS however the in-between cache is obsolete, as individual disk
caches can be used directly. I also openly question whether even the
dedicated RAID HW is faster than the newest CPUs in modern servers.

Unless there is something that I'm missing, I fail to see the benefit of
a HW RAID in tandem with ZFS. In my view, this holds especially true when
one gets into SAN storage like SE6920, EMC and Hitachi products.


I agree with your basic point, that the HW RAID cache is obsoleted by zfs
(which seems to be substantiated here by benchmark results), but I think
you slightly mischaracterize its use.  The speed of the HW RAID CPU is
irrelevant; the parity is XOR which is extremely fast with any CPU when
compared to disk write speed.

What is relevant is, as Anton points out, the CPU cache on the host system.
Parity calculations kill the cache and will hurt memory-intensive apps.
So offloading it may help in the ufs case.  (Not for zfs,
as I understand from reading here, since checksums still have to be done.
I would argue that this is *absolutely essential* [and zfs obsoletes all
other filesystems] and therefore the gain in the ufs on HW RAID-5 case is
worthless due to the correctness tradeoff.)

It would be interesting to have a zfs enabled HBA to offload the checksum
and parity calculations.  How much of zfs would such an HBA have to
understand?

-frank
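
For readers following the parity point, a minimal sketch in C of what RAID-5
style parity generation amounts to (the stripe geometry and chunk size here
are made up for the example). The XOR itself is trivial; the cost Anton
points to is that the loop drags every byte of the stripe through the CPU
caches:

  #include <stdint.h>
  #include <stddef.h>
  #include <string.h>

  #define NDATA   4               /* data columns per stripe (hypothetical) */
  #define CHUNK   (64 * 1024)     /* bytes per column (hypothetical) */

  /* XOR the NDATA data chunks of one stripe into the parity chunk. */
  static void
  xor_parity(const uint8_t data[NDATA][CHUNK], uint8_t parity[CHUNK])
  {
          memset(parity, 0, CHUNK);
          for (int d = 0; d < NDATA; d++)
                  for (size_t i = 0; i < CHUNK; i++)
                          parity[i] ^= data[d][i];
  }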



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-12 Thread Roch - PAE

Anton B. Rang writes:

   > The bigger problem with system utilization for software RAID is the
   > cache, not the CPU cycles proper. Simply preparing to write 1 MB of
   > data will flush half of a 2 MB L2 cache. This hurts overall system
   > performance far more than the few microseconds that XORing the data
   > takes.


With ZFS, on most deployments we'll bring the data into cache for the
checksums, so I guess the raid-z cost will be just incremental.

Now, would we gain anything by generating ZFS functions for
'checksum+parity' and 'checksum+parity+compression'?


-r
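
Purely as a sketch of what a fused 'checksum+parity' routine might look like
(speculative, not ZFS code): one pass over the buffer both accumulates a
Fletcher-style checksum and XORs the data into a parity column, so the data
is pulled through the CPU cache only once.

  #include <stdint.h>
  #include <stddef.h>

  static void
  checksum_and_parity(const uint64_t *data, uint64_t *parity, size_t nwords,
      uint64_t cksum[2])
  {
          uint64_t a = 0, b = 0;

          for (size_t i = 0; i < nwords; i++) {
                  a += data[i];           /* checksum accumulators */
                  b += a;
                  parity[i] ^= data[i];   /* raid-z style XOR parity */
          }
          cksum[0] = a;
          cksum[1] = b;
  }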



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-09 Thread Anton Rang

On Sep 9, 2006, at 1:32 AM, Frank Cusack wrote:

On September 7, 2006 12:25:47 PM -0700 "Anton B. Rang"  
<[EMAIL PROTECTED]> wrote:

The bigger problem with system utilization for software RAID is the
cache, not the CPU cycles proper. Simply preparing to write 1 MB of data
will flush half of a 2 MB L2 cache. This hurts overall system performance
far more than the few microseconds that XORing the data takes.


Interesting.  So does this in any way invalidate benchmarks recently posted
here which showed raidz on jbod to outperform a zfs stripe on HW raid5?

No.  There are, in fact, two reasons why RAID-Z is likely to outperform
hardware RAID 5, at least in certain types of I/O benchmarks.  First,
RAID-5 requires read-modify-write cycles when full stripes aren't being
written; and ZFS tends to issue small and pretty much random I/O (in my
experience), which is the worst case for RAID-5.  Second, performing RAID
on the main CPU is faster, or at least just as fast, as in hardware.
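
For concreteness, the textbook I/O counts behind that read-modify-write
penalty (generic RAID-5 arithmetic, not a measurement from this thread):

  partial-stripe update:  read old data + read old parity
                          + write new data + write new parity = 4 disk I/Os
  full-stripe write:      write N data chunks + 1 parity chunk, no reads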

There are also cases where hardware RAID 5 will likely outperform ZFS.
One is when there is a large RAM cache (which is not being flushed by
ZFS -- one issue to be addressed is that the commands ZFS uses to control
the write cache on plain disks tend to effectively disable the NVRAM
cache on hardware RAID controllers).  Another is when the I/O bandwidth
being used is near the maximum capacity of the host channel, because
doing software RAID requires moving more data over this channel.  (If
you have sufficient channels to dedicate one per disk, as is the case
with SATA, this doesn't come into play.)  This is particularly
noticeable during reconstruction, since the channels are being used
both to transfer data & reconstruct it, where in a hardware RAID-5
box (of moderate cost, at least) they are typically overprovisioned.
A third is if the system CPU or memory bandwidth is heavily used by
your application; for instance, a database running under heavy load.
In this case, the added CPU, cache, and memory-bandwidth load of software
RAID will put additional stress on the application.

Ultimately, you do want to use your actual application as the benchmark,
but certainly generic benchmarks should at least be helpful.


They're helpful in measuring what the benchmark measures.  ;-)  If the
benchmark measures how quickly you can get data from host RAM to disk,
which is typically the case, it won't tell you anything about how much
CPU was used in the process.  Real applications, however, often care.
There's a reason why we use interrupt-driven controllers, even though
you get better performance of the I/O itself with polling.  :-)

Anton



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-08 Thread Frank Cusack
On September 7, 2006 12:25:47 PM -0700 "Anton B. Rang" <[EMAIL PROTECTED]> 
wrote:

The bigger problem with system utilization for software RAID is the
cache, not the CPU cycles proper. Simply preparing to write 1 MB of data
will flush half of a 2 MB L2 cache. This hurts overall system performance
far more than the few microseconds that XORing the data takes.


Interesting.  So does this in any way invalidate benchmarks recently posted
here which showed raidz on jbod to outperform a zfs stripe on HW raid5?
(That's my recollection, perhaps it's a mischaracterization or just plain
wrong.)  I mean, even if raid-z on jbod in a filesystem benchmark is a
winner, when you have an actual application with a working set that is
more than filesystem data, the benchmark results would be misleading.

Ultimately, you do want to use your actual application as the benchmark,
but certainly generic benchmarks should at least be helpful.

-frank


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Frank Cusack
On September 8, 2006 5:59:47 PM -0700 Richard Elling - PAE 
<[EMAIL PROTECTED]> wrote:

Ed Gould wrote:

On Sep 8, 2006, at 11:35, Torrey McMahon wrote:

If I read between the lines here I think you're saying that the raid
functionality is in the chipset but the management can only be done by
software running on the outside. (Right?)


No.  All that's in the chipset is enough to read a RAID volume for
boot.  Block layout, RAID-5 parity calculations, and the rest are all
done in the software.  I wouldn't be surprised if RAID-5 parity checking
was absent on read for boot, but I don't actually know.


At Sun, we often use the LSI Logic LSISAS1064 series of SAS RAID controllers
on motherboards for many products.  [LSI claims support for Solaris 2.6!]
These controllers have a built-in microcontroller (ARM 926, IIRC), firmware,
and nonvolatile memory (NVSRAM) for implementing the RAID features.  We
manage them through BIOS, OBP, or raidctl(1M).  As Torrey says, very much
like the A1000.  Some of the fancier LSI products offer RAID 5, too.


Yes, some (many) of the RAID controllers do all the RAID in the hardware.
I don't see where Ed was disputing that.

But there will always be a [large] market for cheaper but less capable
products and so at least for awhile to come there will be these not-quite-
RAID cards.  Probably for a very long while.

winmodem, anyone?

-frank



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Richard Elling - PAE

Ed Gould wrote:

On Sep 8, 2006, at 11:35, Torrey McMahon wrote:
If I read between the lines here I think you're saying that the raid 
functionality is in the chipset but the management can only be done by 
software running on the outside. (Right?)


No.  All that's in the chipset is enough to read a RAID volume for 
boot.  Block layout, RAID-5 parity calculations, and the rest are all 
done in the software.  I wouldn't be surprised if RAID-5 parity checking 
was absent on read for boot, but I don't actually know.


At Sun, we often use the LSI Logic LSISAS1064 series of SAS RAID controllers
on motherboards for many products.  [LSI claims support for Solaris 2.6!]
These controllers have a built-in microcontroller (ARM 926, IIRC), firmware,
and nonvolatile memory (NVSRAM) for implementing the RAID features.  We manage
them through BIOS, OBP, or raidctl(1M).  As Torrey says, very much like the
A1000.  Some of the fancier LSI products offer RAID 5, too.
 -- richard


RE: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Bennett, Steve

> Dunno about eSATA jbods, but eSATA host ports have
> appeared on at least two HDTV-capable DVRs for storage
> expansion (looks like one model of the Scientific Atlanta
> cable box DVR's as well as on the shipping-any-day-now
> Tivo Series 3).  
> 
> It's strange that they didn't go with firewire since it's 
> already widely used for digital video.

Cost? If you use eSATA it's pretty much just a physical connector onto
the board, whereas I guess FireWire needs a 1394 interface (a couple of
dollars?) plus a royalty to all the patent holders.

It's probably not much, but I can't see how there can be *any* margin in
consumer electronics these days...

Steve.


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Jonathan Edwards


On Sep 8, 2006, at 14:22, Ed Gould wrote:


On Sep 8, 2006, at 9:33, Richard Elling - PAE wrote:
I was looking for a new AM2 socket motherboard a few weeks ago.  All of
the ones I looked at had 2xIDE and 4xSATA with onboard (SATA) RAID.  All
were less than $150.  In other words, the days of having a JBOD-only
solution are over except for single disk systems.  4x750 GBytes is a
*lot* of data (and video).


It's not clear to me that JBOD is dead.  The (S)ATA RAID cards I've  
seen are really software RAID solutions that know just enough in  
the controller to let the BIOS boot off a RAID volume.  None of the  
expensive RAID stuff is in the controller.


additionally, the only RAID levels many of them support are mirroring and
striping (RAID 0, 1, 10, etc.); not as many do parity.



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Ed Gould

On Sep 8, 2006, at 11:35, Torrey McMahon wrote:
If I read between the lines here I think you're saying that the raid 
functionality is in the chipset but the management can only be done by 
software running on the outside. (Right?)


No.  All that's in the chipset is enough to read a RAID volume for 
boot.  Block layout, RAID-5 parity calculations, and the rest are all 
done in the software.  I wouldn't be surprised if RAID-5 parity 
checking was absent on read for boot, but I don't actually know.


--Ed



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Torrey McMahon

Ed Gould wrote:

On Sep 8, 2006, at 9:33, Richard Elling - PAE wrote:
I was looking for a new AM2 socket motherboard a few weeks ago.  All of
the ones I looked at had 2xIDE and 4xSATA with onboard (SATA) RAID.  All
were less than $150.  In other words, the days of having a JBOD-only
solution are over except for single disk systems.  4x750 GBytes is a
*lot* of data (and video).


It's not clear to me that JBOD is dead.  The (S)ATA RAID cards I've 
seen are really software RAID solutions that know just enough in the 
controller to let the BIOS boot off a RAID volume.  None of the 
expensive RAID stuff is in the controller.



If I read between the lines here I think you're saying that the raid 
functionality is in the chipset but the management can only be done by 
software running on the outside. (Right?)


A1000 anyone? :)



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Ed Gould

On Sep 8, 2006, at 9:33, Richard Elling - PAE wrote:
I was looking for a new AM2 socket motherboard a few weeks ago.  All of
the ones I looked at had 2xIDE and 4xSATA with onboard (SATA) RAID.  All
were less than $150.  In other words, the days of having a JBOD-only
solution are over except for single disk systems.  4x750 GBytes is a
*lot* of data (and video).


It's not clear to me that JBOD is dead.  The (S)ATA RAID cards I've 
seen are really software RAID solutions that know just enough in the 
controller to let the BIOS boot off a RAID volume.  None of the 
expensive RAID stuff is in the controller.


--Ed



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Bill Sommerfeld
On Fri, 2006-09-08 at 09:33 -0700, Richard Elling - PAE wrote:
> There has been some recent discussion about eSATA JBODs in the press.  I'm not
> sure they will gain much market share.  iPods and flash drives have a much 
> larger
> market share.

Dunno about eSATA jbods, but eSATA host ports have appeared on at least
two HDTV-capable DVRs for storage expansion (looks like one model of the
Scientific Atlanta cable box DVR's as well as on the
shipping-any-day-now Tivo Series 3).  

It's strange that they didn't go with firewire since it's already widely
used for digital video.  

- Bill










Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Richard Elling - PAE

[EMAIL PROTECTED] wrote:

I don't quite see this in my crystal ball.  Rather, I see all of the SAS/SATA
chipset vendors putting RAID in the chipset.  Basically, you can't get a
"dumb" interface anymore, except for fibre channel :-).  In other words, if
we were to design a system in a chassis with perhaps 8 disks, then we would
also use a controller which does RAID.  So, we're right back to square 1.


Richard, when I talk about cheap JBOD I think about home users/small
servers/small companies. I guess you can sell 100 X4500 and at the same
time 1000 (or even more) cheap JBODs to the small companies which for sure
will not buy the big boxes. Yes, I know, you earn more selling
X4500. But what do you think, how Linux found its way to data centers
and become important player in OS space ? Through home users/enthusiasts who
become familiar with it and then started using the familiar things in
their job. 


I was looking for a new AM2 socket motherboard a few weeks ago.  All of the
ones I looked at had 2xIDE and 4xSATA with onboard (SATA) RAID.  All were
less than $150.  In other words, the days of having a JBOD-only solution are
over except for single disk systems.  4x750 GBytes is a *lot* of data (and
video).

There has been some recent discussion about eSATA JBODs in the press.  I'm
not sure they will gain much market share.  iPods and flash drives have a
much larger market share.


Proven way to achieve "world domination".  ;-))


Dang!  I was planning to steal a cobalt bomb and hold the world hostage while
I relax in my space station... zero-G whee! :-)
 -- richard


Re: Re[2]: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-08 Thread Roch - PAE

zfs "hogs all the ram" under a sustained heavy write load. This is
being tracked by:

6429205 each zpool needs to monitor its throughput and throttle heavy
writers

-r



Re[2]: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-08 Thread Robert Milkowski
Hello James,

Thursday, September 7, 2006, 8:58:10 PM, you wrote:
JD> with ZFS I have found that memory is a much greater limitation, even
JD> my dual 300mhz u2 has no problem filling 2x 20MB/s scsi channels, even
JD> with compression enabled,  using raidz and 10k rpm 9GB drives, thanks
JD> to its 2GB of ram it does great at everything I throw at it. On the
JD> other hand my blade 1500 ram  512MB with 3x 18GB 10k rpm drives using
JD> 2x 40MB/s scsi channels , os is on a 80GB ide drive, has problems
JD> interactively because as soon as you push zfs hard it hogs all the ram
JD> and may take 5 or 10 seconds to get response on xterms while the
JD> machine clears out ram and loads its applications/data back into ram.

IIRC there's a bug in the SPARC ata driver which, when combined with
ZFS, expresses itself.

Unless you use only ZFS on those SCSI drives...?


-- 
Best regards,
 Robert                            mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread przemolicc
On Fri, Sep 08, 2006 at 09:41:58AM +0100, Darren J Moffat wrote:
> [EMAIL PROTECTED] wrote:
> >Richard, when I talk about cheap JBOD I think about home users/small
> >servers/small companies. I guess you can sell 100 X4500 and at the same
> >time 1000 (or even more) cheap JBODs to the small companies which for sure
> >will not buy the big boxes. Yes, I know, you earn more selling
> >X4500. But what do you think, how Linux found its way to data centers
> >and become important player in OS space ? Through home users/enthusiasts 
> >who
> >become familiar with it and then started using the familiar things in
> >their job. 
> 
> But Linux isn't a hardware vendor and doesn't make cheap JBOD or 
> multipack for the home user.

Linux is used as a symbol.

> So I don't see how we get from "Sun should make cheap home user JBOD" 
> (which BTW we don't really have the channel to sell for anyway) to "but 
> Linux dominated this way".

"Home user" = tech/geek/enthusiasts who is an admin in job

[ Linux ]
The "home user" is using Linux at home and is satisfied with it. He/she then
goes to work and says "Let's install/use it on less important servers".
He/she (and management) is again satisfied with it. So let's use it on more
important servers ... etc.

[ ZFS ]
The "home user" is using ZFS (Solaris) at home (remember the easiness and
even the web interface to ZFS operations!) to keep photos, music, etc. and
is satisfied with it. He/she then goes to his/her job and says "I've been
using a fantastic filesystem for a while. Let's use it on less important
servers". OK. Later on: "Works OK. Let's use it on more important ones."
Etc...

Yes, I know, a bit naive. But remember that not only Linux spreads this
way but Solaris as well. I guess most downloaded Solaris CDs/DVDs are for
x86. You as a company "attack" at the high-end/midrange level. Let
users/admins/fans "attack" at the lower-end level.


przemol


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread Darren J Moffat

[EMAIL PROTECTED] wrote:

Richard, when I talk about cheap JBOD I think about home users/small
servers/small companies. I guess you can sell 100 X4500 and at the same
time 1000 (or even more) cheap JBODs to the small companies which for sure
will not buy the big boxes. Yes, I know, you earn more selling
X4500. But what do you think, how Linux found its way to data centers
and become important player in OS space ? Through home users/enthusiasts who
become familiar with it and then started using the familiar things in
their job. 


But Linux isn't a hardware vendor and doesn't make cheap JBOD or 
multipack for the home user.


So I don't see how we get from "Sun should make cheap home user JBOD" 
(which BTW we don't really have the channel to sell for anyway) to "but 
Linux dominated this way".



--
Darren J Moffat


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-08 Thread Roch - PAE

Torrey McMahon writes:
 > Nicolas Dorfsman wrote:
 > >> The hard part is getting a set of simple
 > >> requirements. As you go into 
 > >> more complex data center environments you get hit
 > >> with older Solaris 
 > >> revs, other OSs, SOX compliance issues, etc. etc.
 > >> etc. The world where 
 > >> most of us seem to be playing with ZFS is on the
 > >> lower end of the 
 > >> complexity scale. Sure, throw your desktop some fast
 > >> SATA drives. No 
 > >> problem. Oh wait, you've got ten Oracle DBs on three
 > >> E25Ks that need to 
 > >> be backed up every other blue moon ...
 > >> 
 > >
 > >   Another fact is CPU use.
 > >
 > >   Does anybody really know what will be effects of intensive CPU workload 
 > > on ZFS perfs, and effects of ZFS RAID CPU compute on intensive CPU 
 > > workload ?
 > >
 > >   I heard a story about a customer complaining about his higend server 
 > > performances; when a guy came on site...and discover beautiful SVM RAID-5 
 > > volumes, the solution was almost found.
 > >   
 > 
 > Raid calculations take CPU time but I haven't seen numbers on ZFS usage. 
 > SVM is known for using a fair bit of CPU when performing R5 calculations 
 > and I'm sure other OS have the same issue. EMC used to go around saying 
 > that offloading raid calculations to their storage arrays would increase 
 > application performance because you would free up CPU time to do other 
 > stuff. The "EMC effect" is how they used to market it.
 > 


I just measured quickly that a 1.2GHz SPARC can do [400-500] MB/sec of
encoding (time spent in the misnamed function vdev_raidz_reconstruct)
for a 3-disk raid-z group. Bigger groups should cost more, but I'd also
expect the cost to decrease with increased CPU frequency.

Note that the raidz cost is impacted by this:
6460622 zio_nowait() doesn't live up to its name

-r
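
For anyone who wants to reproduce that kind of number on their own hardware,
a rough micro-benchmark sketch (not Roch's actual measurement; it only times
user-space XOR encoding for a 2-data + 1-parity layout):

  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/time.h>

  #define DATASZ  (64 * 1024 * 1024)      /* 64 MB of "data" to encode */

  int
  main(void)
  {
          size_t nwords = DATASZ / sizeof (uint64_t);
          size_t half = nwords / 2;       /* two data columns */
          uint64_t *data = malloc(DATASZ);
          uint64_t *parity = malloc(DATASZ / 2);
          struct timeval t0, t1;
          double secs;

          if (data == NULL || parity == NULL)
                  return (1);
          memset(data, 0x5a, DATASZ);

          gettimeofday(&t0, NULL);
          for (size_t i = 0; i < half; i++)
                  parity[i] = data[i] ^ data[half + i];
          gettimeofday(&t1, NULL);

          secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
          printf("encoded %d MB in %.3f s (%.0f MB/s)\n",
              DATASZ / (1024 * 1024), secs,
              (DATASZ / (1024.0 * 1024.0)) / secs);
          free(data);
          free(parity);
          return (0);
  }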



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-08 Thread przemolicc
On Thu, Sep 07, 2006 at 12:14:20PM -0700, Richard Elling - PAE wrote:
> [EMAIL PROTECTED] wrote:
> >This is the case where I don't understand Sun's politics at all: Sun
> >doesn't offer really cheap JBOD which can be bought just for ZFS. And
> >don't even tell me about 3310/3320 JBODs - they are horrible expansive :-(
> 
> Yep, multipacks are EOL for some time now -- killed by big disks.  Back when
> disks were small, people would buy multipacks to attach to their 
> workstations.
> There was a time when none of the workstations had internal disks, but I'd
> be dating myself :-)
> 
> For datacenter-class storage, multipacks were not appropriate.  They only
> had single-ended SCSI interfaces which have a limited cable budget which
> limited their use in racks.  Also, they weren't designed to be used in a
> rack environment, so they weren't mechanically appropriate either.  I 
> suppose
> you can still find them on eBay.
> >If Sun wants ZFS to be absorbed quicker it should have such _really_ cheap
> >JBOD.
> 
> I don't quite see this in my crystal ball.  Rather, I see all of the 
> SAS/SATA
> chipset vendors putting RAID in the chipset.  Basically, you can't get a
> "dumb" interface anymore, except for fibre channel :-).  In other words, if
> we were to design a system in a chassis with perhaps 8 disks, then we would
> also use a controller which does RAID.  So, we're right back to square 1.

Richard, when I talk about cheap JBOD I think about home users/small
servers/small companies. I guess you can sell 100 X4500s and at the same
time 1000 (or even more) cheap JBODs to the small companies which for sure
will not buy the big boxes. Yes, I know, you earn more selling the
X4500. But what do you think, how did Linux find its way to data centers
and become an important player in the OS space? Through home
users/enthusiasts who became familiar with it and then started using those
familiar things in their jobs.

Proven way to achieve "world domination".  ;-))

przemol


[zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-07 Thread Anton B. Rang
The bigger problem with system utilization for software RAID is the cache, not 
the CPU cycles proper. Simply preparing to write 1 MB of data will flush half 
of a 2 MB L2 cache. This hurts overall system performance far more than the few 
microseconds that XORing the data takes.

(A similar effect occurs with file system buffering, and this is one reason why 
direct I/O is attractive for databases — there’s no pollution of the system 
cache.)
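
As an aside on the direct I/O point: on Solaris, UFS direct I/O can be
requested per file with directio(3C). A minimal hedged sketch (the file path
is made up, and the call is advisory, so not every filesystem honors it):

  #include <sys/types.h>
  #include <sys/fcntl.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <stdio.h>

  int
  main(void)
  {
          /* hypothetical database file */
          int fd = open("/u01/oradata/data01.dbf", O_RDWR);

          if (fd < 0) {
                  perror("open");
                  return (1);
          }
          if (directio(fd, DIRECTIO_ON) != 0)     /* bypass the buffer cache */
                  perror("directio");

          /* ... reads and writes now avoid polluting the system cache ... */

          (void) close(fd);
          return (0);
  }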
 
 


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-07 Thread Richard Elling - PAE

[EMAIL PROTECTED] wrote:

This is the case where I don't understand Sun's politics at all: Sun
doesn't offer really cheap JBOD which can be bought just for ZFS. And
don't even tell me about 3310/3320 JBODs - they are horrible expansive :-(


Yep, multipacks have been EOL for some time now -- killed by big disks.  Back
when disks were small, people would buy multipacks to attach to their
workstations.  There was a time when none of the workstations had internal
disks, but I'd be dating myself :-)

For datacenter-class storage, multipacks were not appropriate.  They only
had single-ended SCSI interfaces which have a limited cable budget which
limited their use in racks.  Also, they weren't designed to be used in a
rack environment, so they weren't mechanically appropriate either.  I suppose
you can still find them on eBay.


If Sun wants ZFS to be absorbed quicker it should have such _really_ cheap
JBOD.


I don't quite see this in my crystal ball.  Rather, I see all of the SAS/SATA
chipset vendors putting RAID in the chipset.  Basically, you can't get a
"dumb" interface anymore, except for fibre channel :-).  In other words, if
we were to design a system in a chassis with perhaps 8 disks, then we would
also use a controller which does RAID.  So, we're right back to square 1.
 -- richard


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-07 Thread James Dickens

On 9/7/06, Torrey McMahon <[EMAIL PROTECTED]> wrote:

Nicolas Dorfsman wrote:
>> The hard part is getting a set of simple
>> requirements. As you go into
>> more complex data center environments you get hit
>> with older Solaris
>> revs, other OSs, SOX compliance issues, etc. etc.
>> etc. The world where
>> most of us seem to be playing with ZFS is on the
>> lower end of the
>> complexity scale. Sure, throw your desktop some fast
>> SATA drives. No
>> problem. Oh wait, you've got ten Oracle DBs on three
>> E25Ks that need to
>> be backed up every other blue moon ...
>>
>
>   Another fact is CPU use.
>
>   Does anybody really know what will be effects of intensive CPU workload on 
ZFS perfs, and effects of ZFS RAID CPU compute on intensive CPU workload ?
>

With ZFS I have found that memory is a much greater limitation. Even
my dual 300MHz U2 has no problem filling 2x 20MB/s SCSI channels, even
with compression enabled, using raidz and 10k rpm 9GB drives; thanks
to its 2GB of RAM it does great at everything I throw at it. On the
other hand, my Blade 1500 with 512MB of RAM and 3x 18GB 10k rpm drives
using 2x 40MB/s SCSI channels (the OS is on an 80GB IDE drive) has
problems interactively, because as soon as you push ZFS hard it hogs all
the RAM and may take 5 or 10 seconds to get a response on xterms while
the machine clears out RAM and loads its applications/data back into RAM.

James Dickens
uadmin.blogspot.com



>   I heard a story about a customer complaining about his higend server 
performances; when a guy came on site...and discover beautiful SVM RAID-5 volumes, 
the solution was almost found.
>

Raid calculations take CPU time but I haven't seen numbers on ZFS usage.
SVM is known for using a fair bit of CPU when performing R5 calculations
and I'm sure other OS have the same issue. EMC used to go around saying
that offloading raid calculations to their storage arrays would increase
application performance because you would free up CPU time to do other
stuff. The "EMC effect" is how they used to market it.





Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-07 Thread Peter Rival

Richard Elling - PAE wrote:

Torrey McMahon wrote:
Raid calculations take CPU time but I haven't seen numbers on ZFS 
usage. SVM is known for using a fair bit of CPU when performing R5 
calculations and I'm sure other OS have the same issue. EMC used to go 
around saying that offloading raid calculations to their storage 
arrays would increase application performance because you would free 
up CPU time to do other stuff. The "EMC effect" is how they used to 
market it.


In all modern processors, and most ancient processors, XOR takes 1 CPU
cycle and is easily pipelined.  Getting the data from the disk to the
registers takes thousands or hundreds of thousands of CPU cycles.  You will
more likely feel the latency of the read-modify-write for RAID-5 than the
CPU time needed for XOR.  ZFS avoids the read-modify-write, but does
compression, so it is possible that a few more CPU cycles will be used.  But
it should still be a big win because CPU cycles are less expensive than disk
I/O.  Meanwhile, I think we're all looking for good data on this.
 -- richard


I believe the true answer is (wait for it...) It Depends(TM) on what you're
limited by.  If your system under your load is CPU constrained, ZFS
calculating the RAIDZ parity (and checksum) is going to hurt; if you are IO
constrained, then having the otherwise idle CPU do the work (which is, of
course, more than just an XOR instruction, but we all know that) may help.
The ZFS design center of mostly-idle CPUs is not always accurate, although
most customers don't dare push the system to 100% utilization.  It's when
you _do_ hit that point, or when the extra overhead unexpectedly makes you
hit or go beyond that point, that things can get interesting quickly.

- Pete


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-07 Thread Richard Elling - PAE

Torrey McMahon wrote:
Raid calculations take CPU time but I haven't seen numbers on ZFS usage. 
SVM is known for using a fair bit of CPU when performing R5 calculations 
and I'm sure other OS have the same issue. EMC used to go around saying 
that offloading raid calculations to their storage arrays would increase 
application performance because you would free up CPU time to do other 
stuff. The "EMC effect" is how they used to market it.


In all modern processors, and most ancient processors, XOR takes 1 CPU
cycle and is easily pipelined.  Getting the data from the disk to the registers
takes thousands or hundreds of thousands of CPU cycles.  You will more likely
feel the latency of the read-modify-write for RAID-5 than the CPU time needed
for XOR.  ZFS avoids the read-modify-write, but does compression, so it is
possible that a few more CPU cycles will be used.  But it should still be a
big win because CPU cycles are less expensive than disk I/O.  Meanwhile, I
think we're all looking for good data on this.
 -- richard



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-07 Thread Torrey McMahon

Nicolas Dorfsman wrote:

The hard part is getting a set of simple
requirements. As you go into 
more complex data center environments you get hit
with older Solaris 
revs, other OSs, SOX compliance issues, etc. etc.
etc. The world where 
most of us seem to be playing with ZFS is on the
lower end of the 
complexity scale. Sure, throw your desktop some fast
SATA drives. No 
problem. Oh wait, you've got ten Oracle DBs on three
E25Ks that need to 
be backed up every other blue moon ...



  Another fact is CPU use.

  Does anybody really know what will be effects of intensive CPU workload on 
ZFS perfs, and effects of ZFS RAID CPU compute on intensive CPU workload ?

  I heard a story about a customer complaining about his higend server 
performances; when a guy came on site...and discover beautiful SVM RAID-5 
volumes, the solution was almost found.
  


Raid calculations take CPU time but I haven't seen numbers on ZFS usage. 
SVM is known for using a fair bit of CPU when performing R5 calculations 
and I'm sure other OS have the same issue. EMC used to go around saying 
that offloading raid calculations to their storage arrays would increase 
application performance because you would free up CPU time to do other 
stuff. The "EMC effect" is how they used to market it.




[zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-07 Thread Nicolas Dorfsman
> The hard part is getting a set of simple
> requirements. As you go into 
> more complex data center environments you get hit
> with older Solaris 
> revs, other OSs, SOX compliance issues, etc. etc.
> etc. The world where 
> most of us seem to be playing with ZFS is on the
> lower end of the 
> complexity scale. Sure, throw your desktop some fast
> SATA drives. No 
> problem. Oh wait, you've got ten Oracle DBs on three
> E25Ks that need to 
> be backed up every other blue moon ...

  Another factor is CPU use.

  Does anybody really know what the effects of an intensive CPU workload will
be on ZFS performance, and what the effects of ZFS RAID computation will be
on an intensive CPU workload?

  I heard a story about a customer complaining about his high-end server's
performance; when a guy came on site... and discovered beautiful SVM RAID-5
volumes, the solution was almost found.

 Nicolas
 
 


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-06 Thread Torrey McMahon

Roch - PAE wrote:

Thinking some more about this. If your requirements does
mandate some form of mirroring, then it truly seems that ZFS 
should take that in charge if only because of the

self-healing characteristics. So I feel the storage array's
job is to export low latency Luns to ZFS.
  



The hard part is getting a set of simple requirements. As you go into 
more complex data center environments you get hit with older Solaris 
revs, other OSs, SOX compliance issues, etc. etc. etc. The world where 
most of us seem to be playing with ZFS is on the lower end of the 
complexity scale. Sure, throw your desktop some fast SATA drives. No 
problem. Oh wait, you've got ten Oracle DBs on three E25Ks that need to 
be backed up every other blue moon ...


I agree with the general idea that an array, be it one disk or some raid
combination, should simply export low latency LUNs. However, it's the
features offered by the array - like site to site replication - used to
meet more complex requirements that literally slow things down. In many
cases you'll see years-old operational procedures causing those low
latency LUNs to slow down even more. Something really hard to get a
customer to undo because a new-fangled file system is out. ;)



I'd be happy to live with those simple Luns but I guess some
storage will just  refuse to export non-protected  luns. Now
we can definitively take advantage of the Array's capability
of exporting highly resilient Luns;  RAID-5 seems to fit the
bill  rather   well here. Even  an 9+1   luns will  be quite
resilient and have a low block overhead.
  



I think 99x0 used to do 3+1 only. Now it's 7+1 if I recall. Close enough 
I suppose.

So we benefit from the arrays resiliency as well as it's low
latency characteristics. And we mirror data at the ZFS level 
which means great performance and great data integrity and

great availability.

Note that ZFS  write characteristics (all  sequential) means
that  we will commonly be filling  full  stripes on the luns
thus avoiding the partial stripe performance pitfall.



One thing comes to mind in that case. Many arrays do sequential detection
on the blocks that come in to the front end ports.  If things get split up
too much, or arrive out of order, or <insert array characteristic here>,
then you could induce more latency as the array does cartwheels trying to
figure out what's going on.





Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-06 Thread Roch - PAE
Wee Yeh Tan writes:
 > On 9/5/06, Torrey McMahon <[EMAIL PROTECTED]> wrote:
 > > This is simply not true. ZFS would protect against the same type of
 > > errors seen on an individual drive as it would on a pool made of HW raid
 > > LUN(s). It might be overkill to layer ZFS on top of a LUN that is
 > > already protected in some way by the devices internal RAID code but it
 > > does not "make your data susceptible to HW errors caused by the storage
 > > subsystem's RAID algorithm, and slow down the I/O".
 > 
 > & Roch's recommendation to leave at least 1 layer of redundancy to ZFS
 > allows the extension of ZFS's own redundancy features for some truely
 > remarkable data reliability.
 > 
 > Perhaps, the question should be how one could mix them to get the best
 > of both worlds instead of going to either extreme.
 > 
 > > True, ZFS can't manage past the LUN into the array. Guess what? ZFS
 > > can't get past the disk drive firmware either... and that's a good thing
 > > for all parties involved.
 > 
 > 
 > -- 
 > Just me,
 > Wire ...


Thinking some more about this. If your requirements do mandate some form
of mirroring, then it truly seems that ZFS should take charge of that, if
only because of the self-healing characteristics. So I feel the storage
array's job is to export low-latency LUNs to ZFS.

I'd be happy to live with those simple LUNs, but I guess some storage will
just refuse to export non-protected LUNs. Now we can definitely take
advantage of the array's capability of exporting highly resilient LUNs;
RAID-5 seems to fit the bill rather well here. Even a 9+1 LUN will be
quite resilient and have a low block overhead.

So we benefit from the array's resiliency as well as its low-latency
characteristics. And we mirror data at the ZFS level, which means great
performance and great data integrity and great availability.

Note that ZFS write characteristics (all sequential) mean that we will
commonly be filling full stripes on the LUNs, thus avoiding the partial
stripe performance pitfall.

If you must shy away from any form of mirroring, then it's
either stripe your raid-5 luns (performance edge for those
who live dangerously) or raid-z around those raid-5 luns
(lower cost, survives lun failures).

-r



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-05 Thread Richard Elling - PAE

Jonathan Edwards wrote:
Here's 10 options I can think of to summarize combinations of zfs with 
hw redundancy:


#   ZFS     ARRAY HW        CAPACITY    COMMENTS
--  ------  --------------  ----------  --------
1   R0      R1              N/2         hw mirror - no zfs healing (XXX)
2   R0      R5              N-1         hw R5 - no zfs healing (XXX)
3   R1      2 x R0          N/2         flexible, redundant, good perf
4   R1      2 x R5          (N/2)-1     flexible, more redundant, decent perf
5   R1      1 x R5          (N-1)/2     parity and mirror on same drives (XXX)
6   RZ      R0              N-1         standard RAIDZ - no array RAID (XXX)
7   RZ      R1 (tray)       (N/2)-1     RAIDZ+1
8   RZ      R1 (drives)     (N/2)-1     RAID1+Z (highest redundancy)
9   RZ      2 x R5          N-3         triple parity calculations (XXX)
10  RZ      1 x R5          N-2         double parity calculations (XXX)

If you've invested in a RAID controller on an array, you might as well 
take advantage of it, otherwise you could probably get an old D1000 
chassis somewhere and just run RAIDZ on JBOD.  


I think it would be good if RAIDoptimizer could be expanded to show these
cases, too.  Right now, the availability and performance models are simple.
To go to this level, the models get more complex and there are many more
tunables.  However, for a few representative cases, it might make sense to
do deep analysis, even if that analysis does not get translated into a
tool directly.  We have the tools to do the deep analysis, but the models
will need to be written and verified.  That said, does anyone want to see
this sort of analysis?  If so, what configurations should we do first (keep
in mind that each config may take a few hours, maybe more depending on the
performance model).
 -- richard


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-05 Thread Torrey McMahon

Wee Yeh Tan wrote:



Perhaps, the question should be how one could mix them to get the best
of both worlds instead of going to either extreme.


In the specific case of a 3320 I think Jonathan's chart has a lot of 
good info that can be put to use.


In the general case, well, I hate to say this but it depends. From what 
I've seen the general discussions on this list tend toward the "Make my 
small direct connected desktop/server go as fast as possible". Once you 
leave that space and move to the opposite end of the spectrum, a large 
heterogeneous datacenter, you have to start looking at the overall data 
management strategy and how different pieces of technology get 
implemented. (Site to site array replication being a good example.) 
That's where I think you'll find more interesting cases where raid setups
will be used with ZFS on top more often than not.


There is also the speed enhancement provided by a HW raid array, and
usually RAS too, compared to a native disk drive, but the numbers on
that are still coming in and being analyzed. (See previous threads.)


--
Torrey McMahon
Sun Microsystems Inc.



Re: Re[2]: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-05 Thread Jonathan Edwards
On Sep 5, 2006, at 06:45, Robert Milkowski wrote:

Hello Wee,

Tuesday, September 5, 2006, 10:58:32 AM, you wrote:

WYT> On 9/5/06, Torrey McMahon <[EMAIL PROTECTED]> wrote:
>> This is simply not true. ZFS would protect against the same type of
>> errors seen on an individual drive as it would on a pool made of HW raid
>> LUN(s). It might be overkill to layer ZFS on top of a LUN that is
>> already protected in some way by the devices internal RAID code but it
>> does not "make your data susceptible to HW errors caused by the storage
>> subsystem's RAID algorithm, and slow down the I/O".

WYT> & Roch's recommendation to leave at least 1 layer of redundancy to ZFS
WYT> allows the extension of ZFS's own redundancy features for some truely
WYT> remarkable data reliability.

WYT> Perhaps, the question should be how one could mix them to get the best
WYT> of both worlds instead of going to either extreme.

Depends on your data but sometime it could be useful to create HW RAID
and then do just striping on ZFS side between at least two LUNs. That
way you do not get data protection but fs/pool protection with ditto
block. Of course each LUN is HW RAID made of different physical disks.

i remember working up a chart on this list about 2 months ago:

Here's 10 options I can think of to summarize combinations of zfs with hw
redundancy:

#   ZFS     ARRAY HW        CAPACITY    COMMENTS
--  ------  --------------  ----------  --------
1   R0      R1              N/2         hw mirror - no zfs healing (XXX)
2   R0      R5              N-1         hw R5 - no zfs healing (XXX)
3   R1      2 x R0          N/2         flexible, redundant, good perf
4   R1      2 x R5          (N/2)-1     flexible, more redundant, decent perf
5   R1      1 x R5          (N-1)/2     parity and mirror on same drives (XXX)
6   RZ      R0              N-1         standard RAIDZ - no array RAID (XXX)
7   RZ      R1 (tray)       (N/2)-1     RAIDZ+1
8   RZ      R1 (drives)     (N/2)-1     RAID1+Z (highest redundancy)
9   RZ      2 x R5          N-3         triple parity calculations (XXX)
10  RZ      1 x R5          N-2         double parity calculations (XXX)

If you've invested in a RAID controller on an array, you might as well take
advantage of it, otherwise you could probably get an old D1000 chassis
somewhere and just run RAIDZ on JBOD.  If you're more concerned about
redundancy than space, with the SUN/STK 3000 series dual controller arrays
I would either create at least 2 x RAID5 luns balanced across controllers
and zfs mirror, or create at least 4 x RAID1 luns balanced across
controllers and use RAIDZ.  RAID0 isn't going to make that much sense since
you've got a 128KB txg commit on zfs which isn't going to be enough to do a
full stripe in most cases.

je
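
To put rough numbers on the full-stripe point (segment sizes here are
illustrative, not a recommendation):

  4+1 RAID-5 LUN, 64 KB segment:  full stripe = 4 x 64 KB = 256 KB of data,
                                  so a 128 KB ZFS write covers only half a
                                  stripe and forces a read-modify-write
  2+1 RAID-5 LUN, 64 KB segment:  full stripe = 2 x 64 KB = 128 KB of data,
                                  so a 128 KB write can fill a stripe outright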


Re[2]: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-05 Thread Robert Milkowski
Hello Wee,

Tuesday, September 5, 2006, 10:58:32 AM, you wrote:

WYT> On 9/5/06, Torrey McMahon <[EMAIL PROTECTED]> wrote:
>> This is simply not true. ZFS would protect against the same type of
>> errors seen on an individual drive as it would on a pool made of HW raid
>> LUN(s). It might be overkill to layer ZFS on top of a LUN that is
>> already protected in some way by the devices internal RAID code but it
>> does not "make your data susceptible to HW errors caused by the storage
>> subsystem's RAID algorithm, and slow down the I/O".

WYT> & Roch's recommendation to leave at least 1 layer of redundancy to ZFS
WYT> allows the extension of ZFS's own redundancy features for some truely
WYT> remarkable data reliability.

WYT> Perhaps, the question should be how one could mix them to get the best
WYT> of both worlds instead of going to either extreme.

Depends on your data, but sometimes it could be useful to create HW RAID
and then do just striping on the ZFS side between at least two LUNs. That
way you do not get data protection, but you do get fs/pool protection with
ditto blocks. Of course each LUN is HW RAID made of different physical
disks.


-- 
Best regards,
 Robert                            mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-05 Thread Wee Yeh Tan

On 9/5/06, Torrey McMahon <[EMAIL PROTECTED]> wrote:

This is simply not true. ZFS would protect against the same type of
errors seen on an individual drive as it would on a pool made of HW raid
LUN(s). It might be overkill to layer ZFS on top of a LUN that is
already protected in some way by the devices internal RAID code but it
does not "make your data susceptible to HW errors caused by the storage
subsystem's RAID algorithm, and slow down the I/O".


& Roch's recommendation to leave at least 1 layer of redundancy to ZFS
allows the extension of ZFS's own redundancy features for some truly
remarkable data reliability.

Perhaps, the question should be how one could mix them to get the best
of both worlds instead of going to either extreme.


True, ZFS can't manage past the LUN into the array. Guess what? ZFS
can't get past the disk drive firmware either... and that's a good thing
for all parties involved.



--
Just me,
Wire ...


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-04 Thread Torrey McMahon

UNIX admin wrote:

My question is how efficient will ZFS be, given that
it will be layered on top of the hardware RAID and
write cache?



ZFS delivers best performance when used standalone, directly on entire disks. 
By using ZFS on top of a HW RAID, you make your data susceptible to HW errors 
caused by the storage subsystem's RAID algorithm, and slow down the I/O.
  



This is simply not true. ZFS would protect against the same type of 
errors seen on an individual drive as it would on a pool made of HW raid 
LUN(s). It might be overkill to layer ZFS on top of a LUN that is 
already protected in some way by the devices internal RAID code but it 
does not "make your data susceptible to HW errors caused by the storage 
subsystem's RAID algorithm, and slow down the I/O".


True, ZFS can't manage past the LUN into the array. Guess what? ZFS
can't get past the disk drive firmware either... and that's a good thing
for all parties involved.




Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320 - offtopic

2006-09-04 Thread przemolicc
On Mon, Sep 04, 2006 at 01:59:53AM -0700, UNIX admin wrote:
> > My question is how efficient will ZFS be, given that
> > it will be layered on top of the hardware RAID and
> > write cache?
> 
> ZFS delivers best performance when used standalone, directly on entire disks. 
> By using ZFS on top of a HW RAID, you make your data susceptible to HW errors 
> caused by the storage subsystem's RAID algorithm, and slow down the I/O.
> 
> You should see much better performance by not creating a HW RAID, then adding 
> all the disks in the 3320' enclosures to a ZFS RAIDZ pool.

This is the case where I don't understand Sun's politics at all: Sun
doesn't offer a really cheap JBOD which can be bought just for ZFS. And
don't even tell me about the 3310/3320 JBODs - they are horribly expensive :-(

If Sun wants ZFS to be adopted quicker it should have such a _really_ cheap
JBOD.

przemol


[zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-04 Thread UNIX admin
> My question is how efficient will ZFS be, given that
> it will be layered on top of the hardware RAID and
> write cache?

ZFS delivers best performance when used standalone, directly on entire disks. 
By using ZFS on top of a HW RAID, you make your data susceptible to HW errors 
caused by the storage subsystem's RAID algorithm, and slow down the I/O.

You should see much better performance by not creating a HW RAID, and instead
adding all the disks in the 3320 enclosures to a ZFS RAIDZ pool.

Additionally, given enough disks, it might be possible to squeeze even better 
performance by creating several RAIDZ vdevs and striping them. For a discussion 
on this aspect, please see "WHEN TO (AND NOT TO) USE RAID-Z" treatise at 
http://blogs.sun.com/roch/entry/when_to_and_not_to.
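
As a configuration illustration of that suggestion (device names are
hypothetical; this only shows the shape of the command), a single pool built
from two RAIDZ vdevs is striped across them automatically by ZFS:

  zpool create tank \
      raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
      raidz c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0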
 
 