Re: [zfs-discuss] ZFS + L2ARC + Cluster.

2010-11-16 Thread Richard Elling
comment below...

On Nov 15, 2010, at 4:21 PM, Matt Banks wrote:
> 
> On Nov 15, 2010, at 4:15 PM, Erik Trimble wrote:
> 
>> On 11/15/2010 2:55 PM, Matt Banks wrote:
>>> I asked this on the x86 mailing list (and got a "it should work" answer), 
>>> but this is probably more of the appropriate place for it.
>>> 
>>> In a 2 node Sun Cluster (3.2 running Solaris 10 u8, but could be running u9 
>>> if needed), we're looking at moving from VXFS to ZFS.  However, quite 
>>> frankly, part of the rationale is L2ARC.  Would it be possible to use 
>>> internal (SSD) storage for the L2ARC in such a scenario?  My understanding 
>>> is that if a ZFS filesystem is passed from one node to another, the L2ARC 
>>> has to be rebuilt.  So, why can't it just be rebuilt on internal storage?
>>> 
>>> The nodes (x4240's) are identical and would have identical storage 
>>> installed, so the paths would be the same.
>>> 
>>> Has anyone done anything similar to this?  I'd love something more than "it 
>>> should work" before dropping $25k on SSD's...
>>> 
>>> TIA,
>>> matt
>> 
>> If your SSD is part of the shared storage (and, thus, visible from both 
>> nodes), then it will be part of the whole pool when exported/imported by the 
>> cluster failover software.
>> 
>> If, on the other hand, you have an SSD in each node that is attached to the 
>> shared-storage pool as L2ARC, then it's not visible to the other node, and the 
>> L2ARC would have to be reattached and rebuilt in a failover scenario.
>> 
>> 
>> 
>> If you are using ONLY the X4240 systems (i.e., just their internal storage), 
>> then you don't have ANY shared storage - ZFS isn't going to be able to "fail 
>> over" between the two nodes.  You'd have to mirror the data between the two 
>> nodes somehow; they wouldn't be part of the same zpool.
>> 
>> 
>> Really, what you want is something like a J4000-series array dual-attached to 
>> both X4240s, with the SSDs and HDDs installed in the J4000-series chassis, not 
>> in the X4240s.
> 
> 
> 
> Believe you me, had the standalone j4x00's not been EOL'd on 24-Sept-10 (and 
> if they supported SSD's), or if the 2540's/2501 we have attached to this 
> cluster supported SSD's, that would be my first choice (honestly, I LOVE the 
> j4x00's - we get great performance out of them every time we've installed 
> them - better at times than 2540's or 6180's).  However, at this point, the 
> only real choice we seem to have for external storage from Oracle is an F5100 
> or stepping up to a 6580 with a CSM2 or a 7120.  The 6580 obviously ain't 
> gonna happen, and a 7120 leaves us with NFS - and NFS+Solaris+InterSystems Caché 
> has massive performance issues.  The F5100 may be an option, but I'd like to 
> explore this first.
> 
> (In the interest of complete description of this particular configuration: we 
> have 2x 2540's - one of which has a 2501 attached to it - attached to 2x 
> x4240's.  The 2540's are entirely populated with 7200 rpm SATA drives.  The 
> external file systems are VXFS at this point and managed by Volume Manager 
> and have been in production for well over a year.  When these systems were 
> installed, ZFS still wasn't an option for us.)
> 
> I'm OK having to rebuild the L2ARC cache in case of a failover.  

The L2ARC is rebuilt any time the pool is imported.  If the L2ARC devices are not 
found, then the pool is still OK, but will be listed as degraded (see the 
definition of degraded in the zpool man page).  This is harmless from a data 
protection viewpoint, though if you intend to run that way for a long time, you 
might just remove the L2ARC from the pool.
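
Roughly, that housekeeping looks like this (pool and device names below are made 
up - substitute whatever your nodes actually present):

  # see whether the cache devices came along with the import
  zpool status -v tank

  # drop an L2ARC device that is missing or no longer wanted
  zpool remove tank c1t2d0

  # add the locally installed SSD back as L2ARC
  zpool add tank cache c1t2d0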

In the case of clusters with the L2ARC unshared, we do support this under 
NexentaStor HA-Cluster, and it is a fairly common case.  I can't speak for what 
Oracle can "support."
 -- richard

> They don't happen often.  And it's not like this is entirely unprecedented.  
> This is exactly the model Oracle uses for the 7000 series storage with 
> cluster nodes.  The "readzillas" (or whatever they're called now) are in the 
> cluster nodes - meaning if one fails, the other takes over and has to rebuild 
> its L2ARC.
> 
> I'm talking about having an SSD (or more, but let's say 1 for simplicity's 
> sake) in each of the x4240's.  One is sitting unused in node b waiting for 
> node a to fail.  Node a's SSD is in use as L2ARC.  Then, node a fails, the 
> ZFS file systems fail over, and then node b's SSD (located at the same path 
> as it was in node a) is used as L2ARC for the failed over file system.
> 
> The $2,400 for two Marlin SSD's is a LOT less money than the $47k (incl 
> HBA's) the "lowend" F5100 would run (MSRP).
> 
> matt



Re: [zfs-discuss] ZFS + L2ARC + Cluster.

2010-11-15 Thread Matt Banks

On Nov 15, 2010, at 4:15 PM, Erik Trimble wrote:

> On 11/15/2010 2:55 PM, Matt Banks wrote:
>> I asked this on the x86 mailing list (and got a "it should work" answer), 
>> but this is probably more of the appropriate place for it.
>> 
>> In a 2 node Sun Cluster (3.2 running Solaris 10 u8, but could be running u9 
>> if needed), we're looking at moving from VXFS to ZFS.  However, quite 
>> frankly, part of the rationale is L2ARC.  Would it be possible to use 
>> internal (SSD) storage for the L2ARC in such a scenario?  My understanding 
>> is that if a ZFS filesystem is passed from one node to another, the L2ARC 
>> has to be rebuilt.  So, why can't it just be rebuilt on internal storage?
>> 
>> The nodes (x4240's) are identical and would have identical storage 
>> installed, so the paths would be the same.
>> 
>> Has anyone done anything similar to this?  I'd love something more than "it 
>> should work" before dropping $25k on SSD's...
>> 
>> TIA,
>> matt
> 
> If your SSD is part of the shared storage (and, thus, visible from both 
> nodes), then it will be part of the whole pool when exported/imported by the 
> cluster failover software.
> 
> If, on the other hand, you have an SSD in each node that is attached to the 
> shared-storage pool as L2ARC, then it's not visible to the other node, and the 
> L2ARC would have to be reattached and rebuilt in a failover scenario.
> 
> 
> 
> If you are using ONLY the X4240 systems (i.e., just their internal storage), 
> then you don't have ANY shared storage - ZFS isn't going to be able to "fail 
> over" between the two nodes.  You'd have to mirror the data between the two 
> nodes somehow; they wouldn't be part of the same zpool.
> 
> 
> Really, what you want is something like a J4000-series array dual-attached to 
> both X4240s, with the SSDs and HDDs installed in the J4000-series chassis, not 
> in the X4240s.



Believe you me, had the standalone j4x00's not been EOL'd on 24-Sept-10 (and if 
they supported SSD's), or if the 2540's/2501 we have attached to this cluster 
supported SSD's, that would be my first choice (honestly, I LOVE the j4x00's - 
we get great performance out of them every time we've installed them - better 
at times than 2540's or 6180's).  However, at this point, the only real choice 
we seem to have for external storage from Oracle is an F5100 or stepping up to 
a 6580 with a CSM2 or a 7120.  The 6580 obviously ain't gonna happen, and a 7120 
leaves us with NFS - and NFS+Solaris+InterSystems Caché has massive performance 
issues.  The F5100 may be an option, but I'd like to explore this first.

(In the interest of complete description of this particular configuration: we 
have 2x 2540's - one of which has a 2501 attached to it - attached to 2x 
x4240's.  The 2540's are entirely populated with 7200 rpm SATA drives.  The 
external file systems are VXFS at this point and managed by Volume Manager and 
have been in production for well over a year.  When these systems were 
installed, ZFS still wasn't an option for us.)

I'm OK having to rebuild the L2ARC cache in case of a failover.  They don't 
happen often.  And it's not like this is entirely unprecedented.  This is 
exactly the model Oracle uses for the 7000 series storage with cluster nodes.  
The "readzillas" (or whatever they're called now) are in the cluster nodes - 
meaning if one fails, the other takes over and has to rebuild its L2ARC.

I'm talking about having an SSD (or more, but let's say 1 for simplicity's 
sake) in each of the x4240's.  One is sitting unused in node b waiting for node 
a to fail.  Node a's SSD is in use as L2ARC.  Then, node a fails, the ZFS file 
systems fail over, and then node b's SSD (located at the same path as it was in 
node a) is used as L2ARC for the failed over file system.
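
(For what it's worth, the failover-side housekeeping could be scripted roughly 
like this - pool and device names are hypothetical, and this is only a sketch of 
the idea, not a tested Sun Cluster agent:

  # on node b, after node a goes down
  zpool import -f tank             # take over the shared pool
  zpool remove tank c1t2d0         # drop the cache device that lived in node a
  zpool add tank cache c1t2d0      # re-add node b's local SSD at the same path

whether the remove/re-add step is even needed probably depends on how the 
imported pool sees the "foreign" SSD sitting at that path.)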

The $2,400 for two Marlin SSD's is a LOT less money than the $47k (incl HBA's) 
the "lowend" F5100 would run (MSRP).

matt


Re: [zfs-discuss] ZFS + L2ARC + Cluster.

2010-11-15 Thread Erik Trimble

On 11/15/2010 2:55 PM, Matt Banks wrote:

I asked this on the x86 mailing list (and got a "it should work" answer), but 
this is probably more of the appropriate place for it.

In a 2 node Sun Cluster (3.2 running Solaris 10 u8, but could be running u9 if 
needed), we're looking at moving from VXFS to ZFS.  However, quite frankly, 
part of the rationale is L2ARC.  Would it be possible to use internal (SSD) 
storage for the L2ARC in such a scenario?  My understanding is that if a ZFS 
filesystem is passed from one node to another, the L2ARC has to be rebuilt.  
So, why can't it just be rebuilt on internal storage?

The nodes (x4240's) are identical and would have identical storage installed, 
so the paths would be the same.

Has anyone done anything similar to this?  I'd love something more than "it should 
work" before dropping $25k on SSD's...

TIA,
matt


If your SSD is part of the shared storage (and, thus, visible from both 
nodes), then it will be part of the whole pool when exported/imported by 
the cluster failover software.


If, on the other hand, you have an SSD in each node that is attached to 
the shared-storage pool as L2ARC, then it's not visible to the other node, 
and the L2ARC would have to be reattached and rebuilt in a failover scenario.
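
For example (hypothetical pool and device names), if the SSD lives in the shared 
enclosure, the cache vdev simply travels with the pool:

  zpool create tank mirror c2t0d0 c2t1d0 cache c2t8d0   # c2t8d0 = SSD in the JBOD
  zpool export tank          # on node a
  zpool import tank          # on node b - the cache device comes along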




If you are using ONLY the X4240 systems (i.e., just their internal storage), 
then you don't have ANY shared storage - ZFS isn't going to be able to 
"fail over" between the two nodes.  You'd have to mirror the data between 
the two nodes somehow; they wouldn't be part of the same zpool.



Really, what you want is something like a J4000-series array dual-attached to 
both X4240s, with the SSDs and HDDs installed in the J4000-series chassis, 
not in the X4240s.




--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA



[zfs-discuss] ZFS + L2ARC + Cluster.

2010-11-15 Thread Matt Banks
I asked this on the x86 mailing list (and got a "it should work" answer), but 
this is probably more of the appropriate place for it.

In a 2 node Sun Cluster (3.2 running Solaris 10 u8, but could be running u9 if 
needed), we're looking at moving from VXFS to ZFS.  However, quite frankly, 
part of the rationale is L2ARC.  Would it be possible to use internal (SSD) 
storage for the L2ARC in such a scenario?  My understanding is that if a ZFS 
filesystem is passed from one node to another, the L2ARC has to be rebuilt.  
So, why can't it just be rebuilt on internal storage?

The nodes (x4240's) are identical and would have identical storage installed, 
so the paths would be the same.
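
(A quick sanity check that both nodes really do present the SSD at the same 
device name - the device name here is hypothetical - would be to run, on each 
node:

  echo | format             # list the attached disks non-interactively
  ls -l /dev/dsk/c1t2d0s0   # the candidate SSD - should resolve on both nodes

and compare the output.)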

Has anyone done anything similar to this?  I'd love something more than "it 
should work" before dropping $25k on SSD's...

TIA,
matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss