Re: [zfs-discuss] ZFS HW RAID

2009-09-19 Thread Scott Lawson



Bob Friesenhahn wrote:

On Fri, 18 Sep 2009, David Magda wrote:


If you care to keep your pool up and alive as much as possible, then 
mirroring across SAN devices is recommended.


One suggestion I heard was to get a LUN that's twice the size, and 
set copies=2. This way you have some redundancy for incorrect 
checksums.


This only helps for block-level corruption.  It does not help much at 
all if a whole LUN goes away.  It seems best for single disk rpools.
I second this. In my experience you are more likely to have a single LUN
go missing for one reason or another, and it seems most
prudent to support any production data volume with, at the very minimum, a
mirror. This also gives you two copies in a far more resilient
way generally. (And per my other post, there can be other niceties that
come with it as well when coupled with SAN-based LUNs.)


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, 
http://www.simplesystems.org/users/bfriesen/

GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS HW RAID

2009-09-19 Thread Erik Trimble
All this reminds me: how much work (if any) has been done on the 
asynchronous mirroring option? That is, on supporting mirrors with 
radically different access times (useful for supporting a mirror 
across a WAN, where you have hundreds of milliseconds of latency to the other 
side of the mirror)?


-Erik




Scott Lawson wrote:



Bob Friesenhahn wrote:

On Fri, 18 Sep 2009, David Magda wrote:


If you care to keep your pool up and alive as much as possible, 
then mirroring across SAN devices is recommended.


One suggestion I heard was to get a LUN that's twice the size, and 
set copies=2. This way you have some redundancy for incorrect 
checksums.


This only helps for block-level corruption.  It does not help much at 
all if a whole LUN goes away.  It seems best for single disk rpools.
I second this. In my experience you are more likely to have a single 
LUN go missing for one reason or another, and it seems most
prudent to support any production data volume with, at the very minimum, 
a mirror. This also gives you two copies in a far more resilient
way generally. (And per my other post, there can be other niceties 
that come with it as well when coupled with SAN-based LUNs.)


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, 
http://www.simplesystems.org/users/bfriesen/

GraphicsMagick Maintainer, http://www.GraphicsMagick.org/





--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS HW RAID

2009-09-19 Thread Orvar Korvar
I asked the same question about a year ago here, and the posts poured in. 
Search for my user ID. There is more info in that thread about which is best: 
ZFS vs. ZFS + HW RAID.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS HW RAID

2009-09-18 Thread Lloyd H. Gill

Hello folks,

I am sure this topic has been asked before, but I am new to this list. I have read
a ton of docs on the web, but wanted to get some opinions from you all.
Also, if someone has a digest of the last time this was discussed, you can
just send that to me. In any case, I am reading a lot of mixed reviews
related to ZFS on HW RAID devices.

The Sun docs seem to indicate it is possible, but not a recommended course. I
realize there are some advantages, such as snapshots, etc. But the h/w RAID
will handle 'most' disk problems, basically reducing the value of one of the
big reasons to deploy ZFS. One suggestion would be to create the h/w
RAID LUNs as usual, present them to the OS, then do simple striping with
ZFS. Here are my two applications, where I am presented with this
possibility:

Sun Messaging Environment:
We currently use EMC storage. The storage team manages all enterprise
storage. We currently have 10 x 300 GB UFS mailstores presented to the OS. Each
LUN is a HW RAID 5 device. We will be upgrading the application and doing a
hardware refresh of this environment, which will give us the chance to move
to ZFS, but stay on EMC storage. I am sure the storage team will not want to
present us with JBOD. It is their practice to create the HW LUNs and present
them to the application teams. I don't want to end up with a complicated
scenario, but would like to leverage the most I can with ZFS, on the EMC
array as I mentioned.

Sun Directory Environment:
The directory team is running HP DL385 G2 servers, which also have a built-in HW RAID
controller for 5 internal SAS disks. The team currently has DS 5.2 deployed
on RHEL 3, but as we move to DS 6.3.1, they may want to move to Solaris 10. We
have an opportunity to move to ZFS in this environment, but I am curious how
to best leverage ZFS capabilities in this scenario. JBOD is very clear, but
a lot of manufacturers out there are still offering HW RAID technologies
with high-speed caches. Using ZFS with these is not very clear to me, and as
I mentioned, there are very mixed reviews, not of ZFS features, but of how ZFS is
used in HW RAID settings.

Thanks for any observations.

Lloyd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS HW RAID

2009-09-18 Thread Bob Friesenhahn

On Fri, 18 Sep 2009, Lloyd H. Gill wrote:


The Sun docs seem to indicate it is possible, but not a recommended course. I
realize there are some advantages, such as snapshots, etc. But the h/w RAID
will handle 'most' disk problems, basically reducing the value of one of the
big reasons to deploy ZFS. One suggestion would be to create the h/w
RAID LUNs as usual, present them to the OS, then do simple striping with
ZFS.


ZFS will catch issues that the H/W RAID will not.  Other than this, 
there is nothing inherently wrong with simple striping in ZFS 
as long as you are confident about your SAN device.  If your SAN 
device fails, the whole ZFS pool may be lost, and if the failure is 
temporary, then the pool will be down until the SAN is restored.


If you care to keep your pool up and alive as much as possible, then 
mirroring across SAN devices is recommended.
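
For example, a minimal sketch of that layout (the device names here are 
only placeholders for LUNs presented from two different arrays):

   # mirror two LUNs that come from separate arrays / fabrics
   zpool create mailpool mirror c2t0d0 c3t0d0

   # check pool health; ZFS can repair bad blocks from the other side
   zpool status mailpool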


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS HW RAID

2009-09-18 Thread Robert Milkowski

Hi,

see comments inline:

Lloyd H. Gill wrote:


Hello folks,

I am sure this topic has been asked before, but I am new to this list. I have 
read a ton of docs on the web, but wanted to get some opinions from 
you all. Also, if someone has a digest of the last time this was 
discussed, you can just send that to me. In any case, I am reading a 
lot of mixed reviews related to ZFS on HW RAID devices.


The Sun docs seem to indicate it is possible, but not a recommended 
course. I realize there are some advantages, such as snapshots, etc. 
But the h/w RAID will handle 'most' disk problems, basically reducing 
the value of one of the big reasons to deploy ZFS. One 
suggestion would be to create the h/w RAID LUNs as usual, present them 
to the OS, then do simple striping with ZFS. Here are my two 
applications, where I am presented with this possibility:


Of course you can use ZFS on disk arrays with RAID done in hardware, and you 
will still be able to use most ZFS features, including snapshots, 
clones, compression, etc.


It is not recommended in the sense that unless a pool has a redundant 
configuration from ZFS's point of view, ZFS won't be able to heal 
corrupted blocks if they occur (it will still be able to detect them). Most 
other filesystems on the market won't even detect such a case, let alone 
repair it, so if you are OK with not having this great ZFS 
feature, then go ahead. All the other features of ZFS will work as expected.


Now, if you want to present several LUNs with RAID done in HW, then yes, 
the best approach usually is to add them all to a pool in a striped 
configuration. ZFS will always put 2 or 3 copies of metadata on 
different LUNs if possible, so you will end up with some protection 
(self-healing) from ZFS - for metadata at least.


The other option (more expensive) is to do RAID-10 or RAID-Z on top of LUNs 
which are already protected with some RAID level on a disk array. For 
example, if you were to present 4 LUNs, each with RAID-5 done in HW, and then 
create a pool with 'zpool create test mirror lun1 lun2 mirror lun3 lun4', you 
would effectively end up with striped ZFS mirrors on top of RAID-5 LUNs. 
That would of course halve the available logical storage, but it would allow 
ZFS to do self-healing.



Sun Messaging Environment:
We currently use EMC storage. The storage team manages all enterprise 
storage. We currently have 10 x 300 GB UFS mailstores presented to the 
OS. Each LUN is a HW RAID 5 device. We will be upgrading the 
application and doing a hardware refresh of this environment, which 
will give us the chance to move to ZFS, but stay on EMC storage. I am 
sure the storage team will not want to present us with JBOD. It is 
their practice to create the HW LUNs and present them to the 
application teams. I don't want to end up with a complicated scenario, 
but would like to leverage the most I can with ZFS, on the EMC 
array as I mentioned.



Just create a pool that stripes across such LUNs.
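
For example, a rough sketch only (lun1/lun2/lun3 stand in for whatever 
devices the EMC array actually presents):

   # stripe across several HW-RAID-protected LUNs
   zpool create mailpool lun1 lun2

   # more LUNs can be added to the stripe later as the store grows
   zpool add mailpool lun3

Remember that with no ZFS-level redundancy the pool can only detect, not 
repair, corrupted data blocks, as described above.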





Sun Directory Environment:
The directory team is running HP DL385 G2 servers, which also have a built-in 
HW RAID controller for 5 internal SAS disks. The team currently has 
DS 5.2 deployed on RHEL 3, but as we move to DS 6.3.1, they may want to 
move to Solaris 10. We have an opportunity to move to ZFS in this 
environment, but I am curious how to best leverage ZFS capabilities in 
this scenario. JBOD is very clear, but a lot of manufacturers out 
there are still offering HW RAID technologies with high-speed caches. 
Using ZFS with these is not very clear to me, and as I mentioned, 
there are very mixed reviews, not of ZFS features, but of how ZFS is used 
in HW RAID settings.


Here you have three options. The first is to do RAID in HW with one LUN and then 
just create a pool on top of it. ZFS will be able to detect corruption if 
it happens but won't be able to fix it (at least not for data).


Another option is to present each disk as a RAID-0 LUN and then do 
RAID-10 or RAID-Z in ZFS. Most RAID controllers will still use their 
cache in such a configuration, so you would still benefit from it. And 
ZFS will be able to detect and fix corruption if it happens. However, the 
procedure for replacing a failed disk drive could be more complicated or 
even require downtime, depending on the controller and on whether there is a 
management tool for it on Solaris (otherwise, with many PCI controllers, 
if a disk in a single-disk RAID-0 dies you will have to go into the 
controller's BIOS and re-create the failed disk with a new one). But check 
your controller; maybe it is not an issue for you, or maybe it is an 
acceptable approach.


The last option would be to disable the RAID controller, access the disks 
directly, and do RAID in ZFS. That way you lose the controller's cache, of course.
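
As a rough illustration of the last two options (disk names are 
placeholders for the five internal SAS disks):

   # each disk seen individually (single-disk RAID-0 LUNs or raw disks),
   # with the redundancy done by ZFS as RAID-Z
   zpool create dspool raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0

   # or striped mirrors (RAID-10) using four disks plus a hot spare
   zpool create dspool mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 spare c1t4d0

From the ZFS side the second and third options look the same; the 
difference is only whether the controller or ZFS sees the raw disks.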


If your applications are sensitive to write latency to your LDAP 
database, then going with one of the first two options could actually 
prove to be a faster solution (assuming the volume of writes is not so 
big that the cache is 100% utilized all the time, as then it comes down 
to the disks anyway).



Another thing 

Re: [zfs-discuss] ZFS HW RAID

2009-09-18 Thread Scott Lawson



Lloyd H. Gill wrote:


Hello folks,

I am sure this topic has been asked before, but I am new to this list. I have 
read a ton of docs on the web, but wanted to get some opinions from 
you all. Also, if someone has a digest of the last time this was 
discussed, you can just send that to me. In any case, I am reading a 
lot of mixed reviews related to ZFS on HW RAID devices.


The Sun docs seem to indicate it is possible, but not a recommended 
course. I realize there are some advantages, such as snapshots, etc. 
But the h/w RAID will handle 'most' disk problems, basically reducing 
the value of one of the big reasons to deploy ZFS. One 
suggestion would be to create the h/w RAID LUNs as usual, present them 
to the OS, then do simple striping with ZFS. Here are my two 
applications, where I am presented with this possibility:
Comments below from me, as I am a user of both of these environments, both 
with ZFS. You may also want to check the iMS archives or subscribe to 
the list; this is where all the Sun Messaging Server gurus hang out. 
(I listen mostly ;))

The list is info-...@arnold.com and you can get more info here: 
http://mail.arnold.com/info-ims.htmlx


Sun Messaging Environment:
We currently use EMC storage. The storage team manages all enterprise 
storage. We currently have 10 x 300 GB UFS mailstores presented to the 
OS. Each LUN is a HW RAID 5 device. We will be upgrading the 
application and doing a hardware refresh of this environment, which 
will give us the chance to move to ZFS, but stay on EMC storage. I am 
sure the storage team will not want to present us with JBOD. It is 
their practice to create the HW LUNs and present them to the 
application teams. I don't want to end up with a complicated scenario, 
but would like to leverage the most I can with ZFS, on the EMC 
array as I mentioned.
In this environment I do what Bob mentioned in his reply to you: I 
provision two LUNs for each data volume and mirror them with ZFS. 
The LUNs are based on RAID 5
stripes on 3510s, 3511s and 6140s. Mirroring them with ZFS gives all 
of the niceties of ZFS, and it will catch any of the silent data 
corruption type issues that hardware RAID
will not. My reasoning for doing it this way goes back to the Disksuite days as 
well (which I no longer use; it's ZFS or nothing pretty much these days).


My setup is based on 5 x 250 GB mirrored pairs with around 3-4 million 
messages per volume.


The two LUNs I mirror are *always* provisioned from two separate arrays 
in different data centres. This also means that in the case of a massive 
catastrophe at one
data centre, I should have a good copy from the 'mirror of last resort' 
that I can get our business back up and running on quickly.


Another advantage of this is that it allows for relatively easy 
array maintenance and upgrades as well. ZFS only remirrors changed 
blocks rather than doing a complete
block resync like Disksuite does. This allows for very fast convergence 
times on the likes of file servers, where change is relatively light, 
albeit continuous. Mirrors
here are super quick to reconverge in my experience, a little quicker 
than RAIDZs. (I don't have data to back this up, just a casual 
observation.)
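
A rough sketch of what array maintenance looks like with such a mirror 
(device names are placeholders):

   # take the side that lives on the array being serviced offline
   zpool offline mailpool c3t0d0

   # ... do the array maintenance / firmware upgrade ...

   # bring it back; ZFS resilvers only the blocks that changed meanwhile
   zpool online mailpool c3t0d0
   zpool status mailpool    # watch the resilver converge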


In some respects, being both a storage guy and a systems guy: sometimes 
the storage people need to get with the program a bit. :P If you use ZFS 
with one
of its redundant forms (mirrors or RAIDZs) then JBOD presentation will 
be fine.


Sun Directory Environment:
The directory team is running HP DL385 G2 servers, which also have a built-in 
HW RAID controller for 5 internal SAS disks. The team currently has 
DS 5.2 deployed on RHEL 3, but as we move to DS 6.3.1, they may want to 
move to Solaris 10. We have an opportunity to move to ZFS in this 
environment, but I am curious how to best leverage ZFS capabilities in 
this scenario. JBOD is very clear, but a lot of manufacturers out 
there are still offering HW RAID technologies with high-speed caches. 
Using ZFS with these is not very clear to me, and as I mentioned, 
there are very mixed reviews, not of ZFS features, but of how ZFS is used 
in HW RAID settings.
A Sun Directory environment generally isn't very I/O intensive, except 
during massive data reloads or indexing operations. Other than that it is an 
ideal candidate for ZFS
and its rather nice ARC cache. Memory is cheap on a lot of boxes, and it 
will make read-mostly file systems fly. I imagine your actual live 
LDAP data set on disk
probably won't be larger than 10 GB or so? I have around 400K objects 
in mine and it's only about 2 GB or so including all our indexes. I 
tend to tune DS up
so that everything it needs is in RAM anyway. As far as Directory Server 
goes, are you using the 64-bit version on Linux? If not, you should be.
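
If you want to see how much of that data the ARC is actually holding on a 
Solaris box, the standard kstat counters give a quick view (shown purely 
as an illustration):

   # current ARC size in bytes, plus hit/miss counters
   kstat -p zfs:0:arcstats:size
   kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses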


Thanks for any observations.

Lloyd


___
zfs-discuss mailing list

Re: [zfs-discuss] ZFS HW RAID

2009-09-18 Thread Robert Milkowski

Scott Lawson wrote:
A Sun Directory environment generally isn't very I/O intensive, except 
during massive data reloads or indexing operations. Other than that it 
is an ideal candidate for ZFS
and its rather nice ARC cache. Memory is cheap on a lot of boxes, and 
it will make read-mostly file systems fly. I imagine your actual 
live LDAP data set on disk
probably won't be larger than 10 GB or so? I have around 400K 
objects in mine and it's only about 2 GB or so including all our 
indexes. I tend to tune DS up
so that everything it needs is in RAM anyway. As far as Directory 
Server goes, are you using the 64-bit version on Linux? If not, you 
should be.




In my experience, enabling lzjb compression for DS makes it even faster 
and reduces disk usage by about 2x.
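
Something along these lines, with the dataset name made up for the example:

   # enable lzjb compression on the dataset holding the DS database
   zfs set compression=lzjb tank/ds

   # later, check how much space it actually saved
   zfs get compressratio tank/ds

Only blocks written after the property is set get compressed, so an 
existing database needs to be reloaded (or copied) to see the full benefit.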


--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS HW RAID

2009-09-18 Thread David Magda

On Sep 18, 2009, at 16:52, Bob Friesenhahn wrote:

If you care to keep your pool up and alive as much as possible, then  
mirroring across SAN devices is recommended.


One suggestion I heard was to get a LUN that's twice the size, and set  
copies=2. This way you have some redundancy for incorrect checksums.


Haven't done it myself.
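
For anyone wanting to try it, it would look something like this (pool and 
dataset names are made up):

   # keep two copies of every data block written to this dataset
   zfs set copies=2 sanpool/mail

The copies property only applies to data written after it is set, and as 
the replies note, it protects against bad blocks rather than against a 
whole LUN disappearing.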

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS HW RAID

2009-09-18 Thread Bob Friesenhahn

On Fri, 18 Sep 2009, David Magda wrote:


If you care to keep your pool up and alive as much as possible, then 
mirroring across SAN devices is recommended.


One suggestion I heard was to get a LUN that's twice the size, and set 
copies=2. This way you have some redundancy for incorrect checksums.


This only helps for block-level corruption.  It does not help much at 
all if a whole LUN goes away.  It seems best for single disk rpools.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss