Re: [zfs-discuss] HP ProLiant N36L

2010-11-16 Thread Krist van Besien
I can now confirm that NexentaCore runs without a hitch on the N36L
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-16 Thread Rthoreau
Darren J Moffat  writes:

> On 11/15/10 19:36, David Magda wrote:
>
>>> Using ZFS encryption support can be as easy as this:
>>>
>>>   # zfs create -o encryption=on tank/darren
>>>   Enter passphrase for 'tank/darren':
>>>   Enter again:
>>
>
>
>>   2. Both CCM and GCM modes of operation are supported: can you recommend
>> which mode should be used when? I'm guessing it's best to accept the
>> default if you're not sure, but what if we want to expand our knowledge?
>
> You've preempted my next planned posting ;-)  But I'll attempt to give
> an answer here:
>
> 'on' maps to aes-128-ccm, because it is the fastest of the 6 available
> modes of encryption currently provided.  Also I believe it is the
> current wisdom of cryptographers (which I do not claim to be) that AES
> 128 is the preferred key length due to recent discoveries about AES
> 256 that are not known to impact AES 128.
>
> Both CCM[1] and GCM[2] are provided so that if one turns out to have
> flaws hopefully the other will still be available for use safely even
> though they are roughly similar styles of modes.
>
> On systems without hardware/cpu support for Galois multiplication
> (Intel Westmere and later and SPARC T3 and later) GCM will be slower
> because the Galois field multiplication has to happen in software
> without any hardware/cpu assist.  However depending on your workload
> you might not even notice the difference.
>
> One reason you may want to select aes-128-gcm rather than aes-128-ccm
> is that GCM is one of the modes for AES in NSA Suite B[3], but CCM is
> not.
>
> Are there symmetric algorithms other than AES that are of interest ?
> The wrapping key algorithm currently matches the data encryption key
> algorithm, is there interest in providing different wrapping key
> algorithms and configuration properties for selecting which one ?  For
> example doing key wrapping with an RSA keypair/certificate ?
>
> [1] http://en.wikipedia.org/wiki/CCM_mode
> [2] http://en.wikipedia.org/wiki/Galois/Counter_Mode
> [3] http://en.wikipedia.org/wiki/NSA_Suite_B_Cryptography

I appreciate all the hard work that you and the ZFS team have done to
make this happen. I think a lot of people are going to give this a try,
but I noticed that one of the license restrictions is not to run
benchmarks without prior permission from Oracle. Is Oracle going to
post some benchmarks that might give people an idea of the performance
of the various key lengths, or of the performance benefit of using
the newer processors with hardware support?

I think a few graphs and testing procedures would be great. This could be
an opportunity to convince people of the benefit of using SPARC and Oracle
hardware, while at the same time giving them a basic idea of what it could
do for them on their own systems. I would also go as far as saying that
some people would not even know how to set up a baseline to get
comparative test results while using encryption.

I imagine a lot of people are curious about every aspect of
performance and are wondering whether ZFS encryption is ready
for production. I just think that some people might need the little
extra nudge that a few graphs and tests would provide. If it also happens
to come with a few good practices, you could save a lot of people some
time and heartache, as I am sure people are eager to see the results.

Rthoreau

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import is this safe to use "-f" option in this case ?

2010-11-16 Thread Jim Dunham
Tim,

> 
> On Wed, Nov 17, 2010 at 10:12 AM, Jim Dunham  wrote:
> sridhar,
> 
> > I have done the following (which is required for my case)
> >
> > Created a zpool (smpool) on a device/LUN from an array (IBM 6K) on host1
> > created an array-level snapshot of the device using "dscli" to another 
> > device, which was successful.
> > Now I make the snapshot device visible to another host (host2)
> 
> Even though the array is capable of taking device/LUN snapshots, this is a 
> non-standard mode of operation regarding the use of ZFS.
> 
> It raises concerns that if one had a problem using ZFS in this manner, 
> there would be few Oracle or community users of ZFS that could assist. Even 
> if the alleged problem was not related to using ZFS with array based 
> snapshots, usage would always create a level of uncertainty.
> 
> I would suggest using ZFS send / recv instead.
> 
> 
> That's what we call FUD.  "It might be a problem if you use someone else's 
> feature that we duplicate".  If Oracle isn't going to support array-based 
> snapshots, come right out and say it.  You might as well pack up the cart now 
> though, there isn't an enterprise array on the market that doesn't have 
> snapshots, and you will be the ONLY OS I've ever heard of even suggesting 
> that array-based snapshots aren't allowed.

That's not what I said... Non-standard mode of operation is not the same thing 
as not supported. Using ZFS's standard mode of operation, based on its built-in 
support for snapshots, is well-proven, well-documented technology. 

> 
>  
> > would there be any issues ?
> 
> Prior to taking the next snapshot, one must be assured that the device/LUN on 
> host2 is returned to the "zpool export" state. Failure to do this could cause 
> zpool corruption, ZFS I/O failures, or even the possibility of a system panic 
> on host2.
> 
> 
> Really?  And how did you come to that conclusion?  

As prior developer and project lead of host-based snapshot and replication 
software on Solaris, I have first hand experience using ZFS with snapshots.

If, while ZFS on node2 is accessing an instance of snapshot data, the array 
updates the snapshot data, ZFS will see newly created CRCs written by node1. 
These CRCs will be treated as metadata corruption, and depending on exactly 
what ZFS was doing at the time the corruption was detected, the software will 
attempt some form of error recovery.

> OP: Yes, you do need to use a -f.  The zpool has a signature that is there 
> when the pool is imported (this is to keep an admin from accidentally 
> importing the pool to two different systems at the same time).  The only way 
> to clear it is to do a zpool export before taking the initial snapshot, or 
> doing the -f on import.  Jim here is doing a great job of spreading FUD, and 
> none of it is true.
> 
> What you're doing should absolutely work, just make sure there is no I/O in 
> flight when you take the original snapshot.  
> 
> Either export the pool first (I would recommend this approach), shut the 
> system down, or just make sure you aren't doing any writes when taking the 
> array-based snapshot.

These last two statements need clarification. 

ZFS is always on disk consistent, even in the context of using snapshots. 
Therefore, as far as ZFS is concerned, there is no need to ensure that there are 
no I/Os in flight, that the storage pool is exported, that the system is 
shut down, or that no writes are in progress.

Although ZFS is always on disk consistent, many applications are not filesystem 
consistent. To be filesystem consistent, an application by design must issue 
careful writes and/or synchronized filesystem operations. Not knowing this 
fact, or lacking this functionality, a system admin will need to deploy some of 
the workarounds suggested above. The most important one not listed is to stop 
or pause those applications which are known not to be filesystem consistent.
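
In outline, the safe cycle for re-taking the array snapshot looks something like 
this (a sketch only; the pool name comes from the original post, and the host 
prompts and dscli step are placeholders):

   host2# zpool export smpool       # return the LUN to the exported state
   host1# ...refresh the array-level snapshot with dscli...
   host2# zpool import -f smpool    # re-import the refreshed copy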

- Jim

> 
> --Tim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS

2010-11-16 Thread jason
I've done MPxIO over multiple IP links in Linux using multipathd.  Works just 
fine.  It's not part of the initiator, but it accomplishes the same thing.

It was a Linux IET target.  I need to try it here with a COMSTAR target.

-Original Message-
From: Ross Walker 
Sender: zfs-discuss-boun...@opensolaris.org
Date: Tue, 16 Nov 2010 22:05:05 
To: Jim Dunham
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS

2010-11-16 Thread Ross Walker
On Nov 16, 2010, at 7:49 PM, Jim Dunham  wrote:

> On Nov 16, 2010, at 6:37 PM, Ross Walker wrote:
>> On Nov 16, 2010, at 4:04 PM, Tim Cook  wrote:
>>> AFAIK, esx/i doesn't support L4 hash, so that's a non-starter.
>> 
>> For iSCSI one just needs to have a second (third or fourth...) iSCSI session 
>> on a different IP to the target and run mpio/mpxio/mpath whatever your OS 
>> calls multi-pathing.
> 
> MC/S (Multiple Connections per Session) support was added to the iSCSI 
> Target in COMSTAR, now available in Oracle Solaris 11 Express. 

Good to know.

The only initiator I know of that supports that is Windows, but with MC/S one 
at least doesn't need MPIO as the initiator handles the multiplexing over the 
multiple connections itself.

Doing multiple sessions and MPIO is supported almost universally though.

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Adding Sun Flash Accelerator F20's into a Zpool for Optimal Performance [SEC=UNCLASSIFIED]

2010-11-16 Thread Bob Friesenhahn

On Wed, 17 Nov 2010, LEES, Cooper wrote:


Zfs Gods,

I have been approved to buy 2 x F20 PCIe cards for my x4540 to 
increase our IOPs and I was wondering what would be the most benefit 
to gain extra IOPs (both reading and writing) on my zpool.


To clarify, adding a dedicated intent log (slog) only improves 
apparent IOPS for synchronous writes such as via NFS or a database. 
It will not help async writes at all unless they are contending with 
sync writes.  An L2ARC device will help with read IOPS quite a lot 
provided that the working set is larger than system RAM yet smaller 
than the L2ARC device.  If the working set is still much larger than 
RAM plus L2ARC devices, then read performance may still be 
bottlenecked by disk.
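
For reference, attaching such devices is one command each (a sketch only; the 
device names below are hypothetical):

  # zpool add cesspool log mirror c14t0d0 c14t1d0    # mirrored slog
  # zpool add cesspool cache c14t2d0 c14t3d0         # L2ARC (cache) devices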


Take care not to trade IOPS gains for a data rate throughput loss. 
Sometimes cache devices offer less throughput than main store.


There is little doubt that your pool would support more IOPS if it was 
based on more vdevs, containing fewer drives each.


I doubt that anyone here can adequately answer your question without 
measurement data from the system taken while it is under the expected 
load.


Useful tools for producing data to look at are the zilstat.ksh and 
arc_summary.pl scripts which you should find mentioned in the list 
archives.
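
For example (a sketch; the sample interval and count are arbitrary, and both 
scripts are separate downloads):

  # ./zilstat.ksh 10 6      # ZIL/slog activity, 10-second samples
  # ./arc_summary.pl        # ARC and L2ARC summary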


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS

2010-11-16 Thread Jim Dunham
On Nov 16, 2010, at 6:37 PM, Ross Walker wrote:
> On Nov 16, 2010, at 4:04 PM, Tim Cook  wrote:
>> AFAIK, esx/i doesn't support L4 hash, so that's a non-starter.
> 
> For iSCSI one just needs to have a second (third or fourth...) iSCSI session 
> on a different IP to the target and run mpio/mpxio/mpath whatever your OS 
> calls multi-pathing.

MC/S (Multiple Connections per Session) support was added to the iSCSI Target 
in COMSTAR, now available in Oracle Solaris 11 Express. 

- Jim

> -Ross
> 
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import is this safe to use "-f" option in this case ?

2010-11-16 Thread Tim Cook
On Wed, Nov 17, 2010 at 10:12 AM, Jim Dunham wrote:

> sridhar,
>
> > I have done the following (which is required for my case)
> >
> > Created a zpool (smpool) on a device/LUN from an array (IBM 6K) on host1
> > created an array-level snapshot of the device using "dscli" to another
> device, which was successful.
> > Now I make the snapshot device visible to another host (host2)
>
> Even though the array is capable of taking device/LUN snapshots, this is a
> non-standard mode of operation regarding the use of ZFS.
>
> It raises concerns that if one had a problem using ZFS in this manner,
> there would be few Oracle or community users of ZFS that could assist. Even
> if the alleged problem was not related to using ZFS with array based
> snapshots, usage would always create a level of uncertainty.
>
> I would suggest using ZFS send / recv instead.
>
>
That's what we call FUD.  "It might be a problem if you use someone else's
feature that we duplicate".  If Oracle isn't going to support array-based
snapshots, come right out and say it.  You might as well pack up the cart
now though, there isn't an enterprise array on the market that doesn't have
snapshots, and you will be the ONLY OS I've ever heard of even suggesting
that array-based snapshots aren't allowed.



> > would there be any issues ?
>
> Prior to taking the next snapshot, one must be assured that the device/LUN
> on host2 is returned to the "zpool export" state. Failure to do this could
> cause zpool corruption, ZFS I/O failures, or even the possibility of a
> system panic on host2.
>
>
Really?  And how did you come to that conclusion?



OP: Yes, you do need to use a -f.  The zpool has a signature that is there
when the pool is imported (this is to keep an admin from accidentally
importing the pool to two different systems at the same time).  The only way
to clear it is to do a zpool export before taking the initial snapshot, or
doing the -f on import.  Jim here is doing a great job of spreading FUD, and
none of it is true.  What you're doing should absolutely work, just make
sure there is no I/O in flight when you take the original snapshot.

Either export the pool first (I would recommend this approach), shut the
system down, or just make sure you aren't doing any writes when taking the
array-based snapshot.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS

2010-11-16 Thread Ross Walker
On Nov 16, 2010, at 4:04 PM, Tim Cook  wrote:

> 
> 
> On Wed, Nov 17, 2010 at 7:56 AM, Miles Nordin  wrote:
> > "tc" == Tim Cook  writes:
> 
>tc> Channeling Ethernet will not make it any faster. Each
>tc> individual connection will be limited to 1gbit.  iSCSI with
>tc> mpxio may work, nfs will not.
> 
> well...probably you will run into this problem, but it's not
> necessarily totally unsolved.
> 
> I am just regurgitating this list again, but:
> 
>  need to include L4 port number in the hash:
>  
> http://www.cisco.com/en/US/products/ps9336/products_tech_note09186a0080a963a9.shtml#eclb
>  port-channel load-balance mixed  -- for L2 etherchannels
>  mls ip cef load-sharing full -- for L3 routing (OSPF ECMP)
> 
>  nexus makes all this more complicated.  there are a few ways that
>  seem they'd be able to accomplish ECMP:
>   FTag flow markers in ``FabricPath'' L2 forwarding
>   LISP
>   MPLS
>  the basic scheme is that the L4 hash is performed only by the edge
>  router and used to calculate a label.  The routing protocol will
>  either do per-hop ECMP (FabricPath / IS-IS) or possibly some kind of
>  per-entire-path ECMP for LISP and MPLS.  unfortunately I don't
>  understand these tools well enough to lead you further, but if
>  you're not using infiniband and want to do >10way ECMP this is
>  probably where you need to look.
> 
>  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6817942
>  feature added in snv_117, NFS client connections can be spread over multiple 
> TCP connections
>  When rpcmod:clnt_max_conns is set to a value > 1
>  however Even though the server is free to return data on different
>  connections, [it does not seem to choose to actually do so] --
>  6696163 fixed snv_117
> 
>  nfs:nfs3_max_threads=32
>  in /etc/system, which changes the default 8 async threads per mount to
>  32.  This is especially helpful for NFS over 10Gb and sun4v
> 
>  this stuff gets your NFS traffic onto multiple TCP circuits, which
>  is the same thing iSCSI multipath would accomplish.  From there, you
>  still need to do the cisco/??? stuff above to get TCP circuits
>  spread across physical paths.
> 
>  
> http://virtualgeek.typepad.com/virtual_geek/2009/06/a-multivendor-post-to-help-our-mutual-nfs-customers-using-vmware.html
>-- suspect.  it advises ``just buy 10gig'' but many other places
>   say 10G NIC's don't perform well in real multi-core machines
>   unless you have at least as many TCP streams as cores, which is
>   honestly kind of obvious.  lego-netadmin bias.
> 
> 
> 
> AFAIK, esx/i doesn't support L4 hash, so that's a non-starter.

For iSCSI one just needs to have a second (third or fourth...) iSCSI session on 
a different IP to the target and run mpio/mpxio/mpath, whatever your OS calls 
multi-pathing.

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import is this safe to use "-f" option in this case ?

2010-11-16 Thread Jim Dunham
sridhar,

> I have done the following (which is required for my case)
> 
> Created a zpool (smpool) on a device/LUN from an array (IBM 6K) on host1
> created an array-level snapshot of the device using "dscli" to another device, 
> which was successful.
> Now I make the snapshot device visible to another host (host2)

Even though the array is capable of taking device/LUN snapshots, this is a 
non-standard mode of operation regarding the use of ZFS.

It raises concerns that if one had a problem using ZFS in this manner, there 
would be few Oracle or community users of ZFS that could assist. Even if the 
alleged problem was not related to using ZFS with array based snapshots, usage 
would always create a level of uncertainty. 

I would suggest using ZFS send / recv instead.

> I tried "zpool import smpool". Got a warning message that host1 is using this 
> pool (it might be that the smpool metadata has stored this info) and asked to use 
> "-f"
> 
> When i tried zpool import with -f option, I am able to successfully import to 
> host2 and able to access all file systems and snapshots. 
> 
> My query is in this scenario is always safe to use "-f" to import ??

In this scenario, it is safe to use "-f" with zpool import. 

> would there be any issues ? 

Prior to taking the next snapshot, one must be assured that the device/LUN on 
host2 is returned to the "zpool export" state. Failure to do this could cause 
zpool corruption, ZFS I/O failures, or even the possibility of a system panic 
on host2.

> Also I have observed that zpool import took some time to for successful 
> completion. Is there a way minimize "zpool import -f" operation time ??

No.

- Jim

> 
> Regards,
> sridhar.
> -- 
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Adding Sun Flash Accelerator F20's into a Zpool for Optimal Performance [SEC=UNCLASSIFIED]

2010-11-16 Thread LEES, Cooper
Zfs Gods,

I have been approved to buy 2 x F20 PCIe cards for my x4540 to increase our
IOPS, and I was wondering how I would get the most benefit in extra IOPS
(both reading and writing) on my zpool.

Currently I have the following storage zpool, called cesspool:

  pool: cesspool
 state: ONLINE
 scrub: scrub completed after 14h0m with 0 errors on Sat Nov 13 18:11:29
2010
config:

NAME   STATE READ WRITE CKSUM
cesspool   ONLINE   0 0 0
  raidz2-0 ONLINE   0 0 0
c10t0d0p0  ONLINE   0 0 0
c11t0d0p0  ONLINE   0 0 0
c12t0d0p0  ONLINE   0 0 0
c13t0d0p0  ONLINE   0 0 0
c8t1d0p0   ONLINE   0 0 0
c9t1d0p0   ONLINE   0 0 0
c10t1d0p0  ONLINE   0 0 0
c11t1d0p0  ONLINE   0 0 0
c12t1d0p0  ONLINE   0 0 0
c13t1d0p0  ONLINE   0 0 0
c8t2d0p0   ONLINE   0 0 0
  raidz2-1 ONLINE   0 0 0
c9t2d0p0   ONLINE   0 0 0
c10t2d0p0  ONLINE   0 0 0
c11t2d0p0  ONLINE   0 0 0
c12t2d0p0  ONLINE   0 0 0
c13t2d0p0  ONLINE   0 0 0
c8t3d0p0   ONLINE   0 0 0
c9t3d0p0   ONLINE   0 0 0
c10t3d0p0  ONLINE   0 0 0
c11t3d0p0  ONLINE   0 0 0
c12t3d0p0  ONLINE   0 0 0
c13t3d0p0  ONLINE   0 0 0
  raidz2-2 ONLINE   0 0 0
c8t4d0p0   ONLINE   0 0 0
c9t4d0p0   ONLINE   0 0 0
c10t4d0p0  ONLINE   0 0 0
c11t4d0p0  ONLINE   0 0 0
c12t4d0p0  ONLINE   0 0 0
c13t4d0p0  ONLINE   0 0 0
c8t5d0p0   ONLINE   0 0 0
c9t5d0p0   ONLINE   0 0 0
c10t5d0p0  ONLINE   0 0 0
c11t5d0p0  ONLINE   0 0 0
c12t5d0p0  ONLINE   0 0 0
  raidz2-3 ONLINE   0 0 0
c13t5d0p0  ONLINE   0 0 0
c8t6d0p0   ONLINE   0 0 0
c9t6d0p0   ONLINE   0 0 0
c10t6d0p0  ONLINE   0 0 0
c11t6d0p0  ONLINE   0 0 0
c12t7d0p0  ONLINE   0 0 0
c13t6d0p0  ONLINE   0 0 0
c8t7d0p0   ONLINE   0 0 0
c9t7d0p0   ONLINE   0 0 0
c10t7d0p0  ONLINE   0 0 0
c11t7d0p0  ONLINE   0 0 0
spares
  c12t6d0p0AVAIL
  c13t7d0p0AVAIL

As you would imagine with that setup, its IOPS are nothing to write home
about. No slog or cache devices. If I get 2 x F20 PCIe cards, how would
you recommend I use them for most benefit? I was thinking of partitioning the
two drives that show up from the F20, having a mirrored slog (named
slogger) and using the other two vdevs to go into cesspool as cache devices.
Or am I better off using one device solely for the slog and one as a cache device
in my pool (cesspool) ... because if I lose the cache device the pool still
operates, just slows back down? (Am I correct there?)

I am getting an outage on the prod system in December, but I will test them
in my backup x4500 (if I can) before the cutover on the prod system. I will
also be looking to go to the latest firmware and possibly (depending on
costs - awaiting a quote from our Sun/Oracle supplier) Solaris 11 Express ...

Thanks, will appreciate any thoughts.
--
Cooper Ry Lees
HPC / UNIX Systems Administrator - Information Management Services (IMS)
Australian Nuclear Science and Technology Organisation
T  +61 2 9717 3853
F  +61 2 9717 9273
M  +61 403 739 446
E  cooper.l...@ansto.gov.au
www.ansto.gov.au 

Important: This transmission is intended only for the use of the addressee.
It is confidential and may contain privileged information or copyright
material. If you are not the intended recipient, any use or further
disclosure of this communication is strictly forbidden. If you have received
this transmission in error, please notify me immediately by telephone and
delete all copies of this transmission as well as any attachments.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Any opinions on the Brocade 825 Dual port 8Gb FC HBA?

2010-11-16 Thread Kyle McDonald
Does OpenSolaris/Solaris 11 Express have a driver for it already?

Anyone used one already?

 -Kyle

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS

2010-11-16 Thread Tim Cook
On Wed, Nov 17, 2010 at 7:56 AM, Miles Nordin  wrote:

> > "tc" == Tim Cook  writes:
>
>tc> Channeling Ethernet will not make it any faster. Each
>tc> individual connection will be limited to 1gbit.  iSCSI with
>tc> mpxio may work, nfs will not.
>
> well...probably you will run into this problem, but it's not
> necessarily totally unsolved.
>
> I am just regurgitating this list again, but:
>
>  need to include L4 port number in the hash:
>
> http://www.cisco.com/en/US/products/ps9336/products_tech_note09186a0080a963a9.shtml#eclb
>  port-channel load-balance mixed  -- for L2 etherchannels
>  mls ip cef load-sharing full -- for L3 routing (OSPF ECMP)
>
>  nexus makes all this more complicated.  there are a few ways that
>  seem they'd be able to accomplish ECMP:
>   FTag flow markers in ``FabricPath'' L2 forwarding
>   LISP
>   MPLS
>  the basic scheme is that the L4 hash is performed only by the edge
>  router and used to calculate a label.  The routing protocol will
>  either do per-hop ECMP (FabricPath / IS-IS) or possibly some kind of
>  per-entire-path ECMP for LISP and MPLS.  unfortunately I don't
>  understand these tools well enough to lead you further, but if
>  you're not using infiniband and want to do >10way ECMP this is
>  probably where you need to look.
>
>  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6817942
>  feature added in snv_117, NFS client connections can be spread over
> multiple TCP connections
>  When rpcmod:clnt_max_conns is set to a value > 1
>  however Even though the server is free to return data on different
>  connections, [it does not seem to choose to actually do so] --
>  6696163 fixed snv_117
>
>  nfs:nfs3_max_threads=32
>  in /etc/system, which changes the default 8 async threads per mount to
>  32.  This is especially helpful for NFS over 10Gb and sun4v
>
>  this stuff gets your NFS traffic onto multiple TCP circuits, which
>  is the same thing iSCSI multipath would accomplish.  From there, you
>  still need to do the cisco/??? stuff above to get TCP circuits
>  spread across physical paths.
>
>
> http://virtualgeek.typepad.com/virtual_geek/2009/06/a-multivendor-post-to-help-our-mutual-nfs-customers-using-vmware.html
>-- suspect.  it advises ``just buy 10gig'' but many other places
>   say 10G NIC's don't perform well in real multi-core machines
>   unless you have at least as many TCP streams as cores, which is
>   honestly kind of obvious.  lego-netadmin bias.
>



AFAIK, esx/i doesn't support L4 hash, so that's a non-starter.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ideas for ghetto file server data reliability?

2010-11-16 Thread michael . p . sullivan
Ummm… there's a difference between data integrity and data corruption.

Integrity is enforced programmatically by something like a DBMS.  This sets up 
basic rules that ensure the programmer, program or algorithm adhere to a level 
of sanity and bounds.

Corruption is where cosmic rays, bit rot, malware or some other item writes to 
the block level.  ZFS protects systems from a lot of this by the way it's 
constructed to keep metadata, checksums, and duplicates of critical data.

If the filesystem is given bad data it will faithfully lay it down on disk.  If 
that faulty data gets corrupt, ZFS will come in and save the day.

Regards,

Mike

On Nov 16, 2010, at 11:28, Edward Ned Harvey  wrote:

>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Toby Thain
>> 
>> The corruption will at least be detected by a scrub, even in cases where
> it
>> cannot be repaired.
> 
> Not necessarily.  Let's suppose you have some bad memory, and no ECC.  Your
> application does 1 + 1 = 3.  Then your application writes the answer to a
> file.  Without ECC, the corruption happened in memory and went undetected.
> Then the corruption was written to file, with a correct checksum.  So in
> fact it's not filesystem corruption, and ZFS will correctly mark the
> filesystem as clean and free of checksum errors.
> 
> In conclusion:
> 
> Use ECC if you care about your data.
> Do backups if you care about your data.
> 
> Don't be a cheapskate, or else, don't complain when you get bitten by lack
> of adequate data protection.
> 
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS

2010-11-16 Thread Miles Nordin
> "tc" == Tim Cook  writes:

tc> Channeling Ethernet will not make it any faster. Each
tc> individual connection will be limited to 1gbit.  iSCSI with
tc> mpxio may work, nfs will not.

well...probably you will run into this problem, but it's not
necessarily totally unsolved.

I am just regurgitating this list again, but:

 need to include L4 port number in the hash:
 
http://www.cisco.com/en/US/products/ps9336/products_tech_note09186a0080a963a9.shtml#eclb
  port-channel load-balance mixed  -- for L2 etherchannels
  mls ip cef load-sharing full -- for L3 routing (OSPF ECMP)

  nexus makes all this more complicated.  there are a few ways that
  seem they'd be able to accomplish ECMP:
   FTag flow markers in ``FabricPath'' L2 forwarding
   LISP
   MPLS
  the basic scheme is that the L4 hash is performed only by the edge
  router and used to calculate a label.  The routing protocol will
  either do per-hop ECMP (FabricPath / IS-IS) or possibly some kind of
  per-entire-path ECMP for LISP and MPLS.  unfortunately I don't
   understand these tools well enough to lead you further, but if
  you're not using infiniband and want to do >10way ECMP this is
  probably where you need to look.

 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6817942
 feature added in snv_117, NFS client connections can be spread over multiple 
TCP connections
 When rpcmod:clnt_max_conns is set to a value > 1
 however Even though the server is free to return data on different
 connections, [it does not seem to choose to actually do so] --
 6696163 fixed snv_117

  nfs:nfs3_max_threads=32
  in /etc/system, which changes the default 8 async threads per mount to
  32.  This is especially helpful for NFS over 10Gb and sun4v

  this stuff gets your NFS traffic onto multiple TCP circuits, which
  is the same thing iSCSI multipath would accomplish.  From there, you
  still need to do the cisco/??? stuff above to get TCP circuits
  spread across physical paths.

  
http://virtualgeek.typepad.com/virtual_geek/2009/06/a-multivendor-post-to-help-our-mutual-nfs-customers-using-vmware.html
-- suspect.  it advises ``just buy 10gig'' but many other places
   say 10G NIC's don't perform well in real multi-core machines
   unless you have at least as many TCP streams as cores, which is
   honestly kind of obvious.  lego-netadmin bias.
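
 a consolidated /etc/system sketch of the two tunables above (values are
 illustrative only; verify the names against your build, and reboot to apply):

   set rpcmod:clnt_max_conns = 8
   set nfs:nfs3_max_threads = 32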


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cannot delete file when fs 100% full

2010-11-16 Thread tanisha singh
Hi. I ran into that damn problem too. And after days of searching I finally 
found this software: Delete Long Path File Tool.

It's GREAT. You can find it here: http://www.deletelongfile.com
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Pool versions

2010-11-16 Thread Ian Collins

On 11/17/10 05:45 AM, Cindy Swearingen wrote:

Hi Ian,

The pool and file system version information is available in
the ZFS Administration Guide, here:

http://docs.sun.com/app/docs/doc/821-1448/appendixa-1?l=en&a=view

The OpenSolaris version pages are up-to-date now also.


Thanks Cindy!

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool import is this safe to use "-f" option in this case ?

2010-11-16 Thread sridhar surampudi
Hi,

I have done the following (which is required for my case)

Created a zpool (smpool) on a device/LUN from an array (IBM 6K) on host1
created an array-level snapshot of the device using "dscli" to another device, 
which was successful.
Now I make the snapshot device visible to another host (host2)

I tried "zpool import smpool". Got a warning message that host1 is using this 
pool (might be the smpool metata data has stored this info) and asked to use 
"-f"

When I tried zpool import with the -f option, I was able to successfully import the 
pool on host2 and to access all file systems and snapshots. 

My query is: in this scenario, is it always safe to use "-f" to import?
Would there be any issues? 
Also, I have observed that zpool import took some time to complete 
successfully. Is there a way to minimize the "zpool import -f" operation time?

Regards,
sridhar.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Pool versions

2010-11-16 Thread Cindy Swearingen

Hi Ian,

The pool and file system version information is available in
the ZFS Administration Guide, here:

http://docs.sun.com/app/docs/doc/821-1448/appendixa-1?l=en&a=view

The OpenSolaris version pages are up-to-date now also.

Thanks,

Cindy

On 11/15/10 16:42, Ian Collins wrote:

Is there an up to date reference following on from

http://hub.opensolaris.org/bin/view/Community+Group+zfs/24

listing what's in the zpool versions up to the current 31?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-16 Thread David Magda

On Nov 15, 2010, at 14:36, David Magda wrote:



Looking forward to playing with it. Some questions:

  1. Is it possible to do a 'zfs create -o encryption=off
tank/darren/music' after the above command? I don't much care if my MP3s
are encrypted. :)

  2. Both CCM and GCM modes of operation are supported: can you recommend
which mode should be used when? I'm guessing it's best to accept the
default if you're not sure, but what if we want to expand our knowledge?


For (2), just posted:

http://blogs.sun.com/darren/entry/choosing_a_value_for_the

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Crypto in Oracle Solaris 11 Express

2010-11-16 Thread Darren J Moffat

On 11/15/10 19:36, David Magda wrote:

On Mon, November 15, 2010 14:14, Darren J Moffat wrote:

Today Oracle Solaris 11 Express was released and is available for
download[1], this release includes on disk encryption support for ZFS.

Using ZFS encryption support can be as easy as this:

  # zfs create -o encryption=on tank/darren
  Enter passphrase for 'tank/darren':
  Enter again:


Looking forward to playing with it. Some questions:
  1. Is it possible to do a 'zfs create -o encryption=off
tank/darren/music' after the above command? I don't much care if my MP3s
are encrypted. :)


No, all child filesystems must be encrypted as well.  This is to avoid 
problems with mounting during boot / pool import.  It is possible this 
could be relaxed in the future but it is highly dependent on some other 
things that may not work out.



  2. Both CCM and GCM modes of operation are supported: can you recommend
which mode should be used when? I'm guessing it's best to accept the
default if you're not sure, but what if we want to expand our knowledge?


You've preempted my next planned posting ;-)  But I'll attempt to give 
an answer here:


'on' maps to aes-128-ccm, because it is the fastest of the 6 available
modes of encryption currently provided.  Also I believe it is the 
current wisdom of cryptographers (which I do not claim to be) that AES 
128 is the preferred key length due to recent discoveries about AES 256 
that are not known to impact AES 128.


Both CCM[1] and GCM[2] are provided so that if one turns out to have 
flaws hopefully the other will still be available for use safely even 
though they are roughly similar styles of modes.


On systems without hardware/cpu support for Galois multiplication (Intel 
Westmere and later and SPARC T3 and later) GCM will be slower because 
the Galois field multiplication has to happen in software without any 
hardware/cpu assist.  However depending on your workload you might not 
even notice the difference.


One reason you may want to select aes-128-gcm rather than aes-128-ccm is 
that GCM is one of the modes for AES in NSA Suite B[3], but CCM is not.
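
For example, a mode can also be selected explicitly at create time (the dataset 
name below is only an illustration):

  # zfs create -o encryption=aes-128-gcm tank/suiteb
  Enter passphrase for 'tank/suiteb':
  Enter again: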


Are there symmetric algorithms other than AES that are of interest ?
The wrapping key algorithm currently matches the data encryption key 
algorithm, is there interest in providing different wrapping key 
algorithms and configuration properties for selecting which one ?  For 
example doing key wrapping with an RSA keypair/certificate ?


[1] http://en.wikipedia.org/wiki/CCM_mode
[2] http://en.wikipedia.org/wiki/Galois/Counter_Mode
[3] http://en.wikipedia.org/wiki/NSA_Suite_B_Cryptography

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Changing GUID

2010-11-16 Thread Phil Harman
Actually, I did this very thing a couple of years ago with M9000s and EMC DMX4s 
... with the exception of the "same host" requirement you have (i.e. the thing 
that requires the GUID change).

If you want to import the pool back into the host where the cloned pool is also 
imported, it's not just the zpool's GUID that needs to be changed, but all the 
vdevs in the pool too.

When I did some work on OpenSolaris in Amazon S3, I noticed that someone had 
built a zpool mirror split utility (before we had the real thing) as a means to 
clone boot disk images. IIRC it was just a hack of zdb, but with the ZFS source 
out there it's not that impossible to take a zpool and change all its GUIDs, 
it's just not that trivial (the Amazon case only handled a single simple 
mirrored vdev).

Anyway, back to my EMC scenario...

The dear data centre staff I had to work with mandated the use of good old EMC 
BCVs. I pointed out that ZFS's "always consistent on disk" promise meant that 
it would "just work" but that this required a consistent snapshot of all the 
LUNs in the pool (a feature in addition to basic BCVs that EMC charged even 
more for). Hoping to save money, my customer ignored my advice, and very 
quickly learned the error of their ways!

The "always consistent on disk" promise cannot be honoured if the vdev are 
snapshot at different times. On a quiet system you may get lucky in simple 
tests, only to find that a snapshot from a busy production system causes a 
system panic on import (although the more recent automatic uberblock recovery 
may save you).

The other thing I would add to your procedure is to take a ZFS snapshot just 
before taking the storage level snapshot. You could sync this with quiescing 
applications, but the real benefit is that you have a known point in time where 
all non-sync application level writes are temporally consistent.
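
For example (a minimal sketch; the pool and snapshot names are hypothetical):

  # zfs snapshot -r smpool@pre-array-snapshot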

Phil
http://harmanholistix.com

On 15 Nov 2010, at 10:11, sridhar surampudi  wrote:

> Hi I am looking in similar lines,
> 
> my requirement is 
> 
> 1. create a zpool on one or many devices ( LUNs ) from an array ( array can 
> be IBM or HPEVA or EMC etc.. not SS7000).
> 2. Create file systems on zpool
> 3. Once file systems are in use (I/0 is happening) I need to take snapshot at 
> array level
> a. Freeze the zfs flle system ( not required due to zfs consistency : source 
> : mailing groups)
> b. take array snapshot ( say .. IBM flash copy )
> c. Got new snapshot device (having same data and metadata including same GUID 
> of source pool)
> 
>  Now I need a way to change the GUID and pool of snapshot device so that the 
> snapshot device can be accessible on same host or an alternate host (if the 
> LUN is shared).
> 
> Could you please post commands for the same.
> 
> Regards,
> sridhar.
> -- 
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New system, Help needed!

2010-11-16 Thread Richard Elling
On Nov 15, 2010, at 8:48 AM, Frank wrote:

> I am a newbie on Solaris. 
> We recently purchased a Sun Sparc M3000 server. It comes with 2 identical 
> hard drives. I want to setup a raid 1. After searching on google, I found 
> that the hardware raid was not working with M3000. So I am here to look for 
> help on how to setup ZFS to use raid 1. Currently one hard drive is installed 
> with Solaris 10 10/09, I want to setup ZFS raid 1 without reinstalling 
> Solaris, it that possible, and how can I do that. 

The process is documented in the ZFS Administration Guide.
http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/zfsadmin.pdf
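
In outline it is an attach plus boot blocks (a sketch only; the device names are 
hypothetical and assume a SPARC ZFS root pool):

  # zpool attach rpool c0t0d0s0 c0t1d0s0
  # installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0

Let the resilver complete (watch zpool status rpool) before relying on the mirror.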
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Excruciatingly slow resilvering on X4540 (build 134)

2010-11-16 Thread Richard Elling
Measure the I/O performance with iostat.  You should see something that
looks sorta like (iostat -zxCn 10):
extended device statistics  
r/sw/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
 5948.9  349.3 40322.3 5238.1 0.1 16.70.02.7   0 330 c9
3.70.0  230.70.0  0.0  0.10.0   13.5   0   2 c9t1d0
  845.00.0 5497.40.0  0.0  0.90.01.1   1  32 c9t2d0
3.80.0  230.70.0  0.0  0.00.0   10.6   0   1 c9t3d0
  845.20.0 5495.40.0  0.0  0.90.01.1   1  32 c9t4d0
3.80.0  237.10.0  0.0  0.00.0   10.4   0   1 c9t5d0
  841.40.0 5519.70.0  0.0  0.90.01.1   1  32 c9t6d0
3.80.0  237.30.0  0.0  0.00.09.2   0   1 c9t7d0
  843.50.0 5485.20.0  0.0  0.90.01.1   1  31 c9t8d0
3.70.0  230.80.0  0.0  0.10.0   15.2   0   2 c9t9d0
  850.20.0 5488.60.0  0.0  0.90.01.1   1  31 c9t10d0
3.10.0  211.20.0  0.0  0.00.0   13.2   0   1 c9t11d0
  847.90.0 5523.40.0  0.0  0.90.01.1   1  31 c9t12d0
3.10.0  204.90.0  0.0  0.00.09.6   0   1 c9t13d0
  847.20.0 5506.00.0  0.0  0.90.01.1   1  31 c9t14d0
3.40.0  224.10.0  0.0  0.00.0   12.3   0   1 c9t15d0
0.0  349.30.0 5238.1  0.0  9.90.0   28.4   1 100 c9t16d0

Here you can clearly see a raidz2 resilver in progress. c9t16d0
is the disk being resilvered (write workload) and half of the 
others are being read to generate the resilvering data.  Note
the relative performance and the ~30% busy for the surviving
disks.  If you see iostat output that looks significantly different
than this, then you might be seeing one of two common causes:

1. Your version of ZFS has the new resilver throttle *and* the
  pool is otherwise servicing I/O.

2. Disks are throwing errors or responding very slowly.  Use
  fmdump -eV to observe error reports.
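
For example, a quick way to look for (2) (illustrative commands; exact output 
varies by build):

  # iostat -En | grep -i errors    # per-device soft/hard/transport error counters
  # fmdump -eV | grep class        # classes of logged error reports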

 -- richard

On Nov 1, 2010, at 12:33 PM, Mark Sandrock wrote:

> Hello,
> 
>   I'm working with someone who replaced a failed 1TB drive (50% utilized),
> on an X4540 running OS build 134, and I think something must be wrong.
> 
> Last Tuesday afternoon, zpool status reported:
> 
> scrub: resilver in progress for 306h0m, 63.87% done, 173h7m to go
> 
> and a week being 168 hours, that put completion at sometime tomorrow night.
> 
> However, he just reported zpool status shows:
> 
> scrub: resilver in progress for 447h26m, 65.07% done, 240h10m to go
> 
> so it's looking more like 2011 now. That can't be right.
> 
> I'm hoping for a suggestion or two on this issue.
> 
> I'd search the archives, but they don't seem searchable. Or am I wrong about 
> that?
> 
> Thanks.
> Mark (subscription pending)
> 
> 
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


-- 
ZFS and performance consulting
http://www.RichardElling.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS + L2ARC + Cluster.

2010-11-16 Thread Richard Elling
comment below...

On Nov 15, 2010, at 4:21 PM, Matt Banks wrote:
> 
> On Nov 15, 2010, at 4:15 PM, Erik Trimble wrote:
> 
>> On 11/15/2010 2:55 PM, Matt Banks wrote:
>>> I asked this on the x86 mailing list (and got a "it should work" answer), 
>>> but this is probably more of the appropriate place for it.
>>> 
>>> In a 2 node Sun Cluster (3.2 running Solaris 10 u8, but could be running u9 
>>> if needed), we're looking at moving from VXFS to ZFS.  However, quite 
>>> frankly, part of the rationale is L2ARC.  Would it be possible to use 
>>> internal (SSD) storage for the L2ARC in such a scenario?  My understanding 
>>> is that if a ZFS filesystem is passed from one node to another, the L2ARC 
>>> has to be rebuilt.  So, why can't it just be rebuilt on internal storage?
>>> 
>>> The nodes (x4240's) are identical and would have identical storage 
>>> installed, so the paths would be the same.
>>> 
>>> Has anyone done anything similar to this?  I'd love something more than "it 
>>> should work" before dropping $25k on SSD's...
>>> 
>>> TIA,
>>> matt
>> 
>> If your SSD is part of the shared storage (and, thus, visible from both 
>> nodes), then it will be part of the whole pool when exported/imported by the 
>> cluster failover software.
>> 
>> If, on the other hand, you have an SSD in each node that is attached to the 
>> shared-storage as L2ARC, then it's not visible to the other node, and the 
>> L2ARC would have to be reattached and rebuilt in a failover senario.
>> 
>> 
>> 
>> If you are using only X4240 systems ONLY, then you don't have ANY shared 
>> storage - ZFS isn't going to be able to "failover" between the two nodes.  
>> You'd have to mirror the data between the two nodes somehow; they wouldn't 
>> be part of the same zpool.
>> 
>> 
>> Really, what you want is something like a J4000-series dual-attached to both 
>> X4240, with SSDs and HDs installed in the J4000-series chassis, not in the 
>> X4240s.
> 
> 
> 
> Believe you me, had the standalone j4x00's not been EOL'd on 24-Sept-10 (and 
> if they supported SSD's), or if the 2540's/2501 we have attached to this 
> cluster supported SSD's, that would be my first choice (honestly, I LOVE the 
> j4x00's - we get great performance out of them every time we've installed 
> them - better at times than 2540's or 6180's).  However, at this point, the 
> only real choice we seem to have for external storage from Oracle is an F5100 
> or stepping up to a 6580 with a CSM2 or a 7120.  The 6580 obviously ain't 
> gonna happen and a 7120 leaves us with NFS and NFS+Solaris+Intersystems Caché 
> has massive performance issues.  The F5100 may be an option, but I'd like to 
> explore this first.
> 
> (In the interest of complete description of this particular configuration: we 
> have 2x 2540's - one of which has a 2501 attached to it - attached to 2x 
> x4240's.  The 2540's are entirely populated with SATA 7200k rpm drives.  The 
> external file systems are VXFS at this point and managed by Volume Manager 
> and have been in production for well over a year.  When these systems were 
> installed, ZFS still wasn't an option for us.)
> 
> I'm OK having to rebuild the L2ARC cache in case of a failover.  

The L2ARC is rebuilt any time the pool is imported.  If the L2ARC devices are 
not found, then the pool is still ok, but will be listed as degraded (see the 
definition of degraded in the zpool man page).  This is harmless from a data 
protection viewpoint, though if you intend to run that way for a long time, you 
might just remove the L2ARC from the pool.
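
If you do drop it, removal is non-destructive (a sketch; pool and device names 
are hypothetical):

  # zpool remove tank c2t0d0    # detaches the cache device; the cached data is simply discarded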

In the case of clusters with the L2ARC unshared, we do support this under 
NexentaStor HA-Cluster and it is a fairly common case. I can't speak for what 
Oracle can "support."
 -- richard

> They don't happen often.  And it's not like this is entirely unprecedented.  
> This is exactly the model Oracle uses for the 7000 series storage with 
> cluster nodes.  The "readzillas" (or whatever they're called now) are in the 
> cluster nodes - meaning if one fails, the other takes over and has to rebuild 
> its L2ARC.
> 
> I'm talking about having an SSD (or more, but let's say 1 for simplicity's 
> sake) in each of the x4240's.  One is sitting unused in node b waiting for 
> node a to fail.  Node a's SSD is in use as L2ARC.  Then, node a fails, the 
> ZFS file systems fail over, and then node b's SSD (located at the same path 
> as it was in node a) is used as L2ARC for the failed over file system.
> 
> The $2,400 for two Marlin SSD's is a LOT less money than the $47k (incl 
> HBA's) the "lowend" F5100 would run (MSRP).
> 
> matt
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Changing GUID

2010-11-16 Thread Richard Elling
On Nov 15, 2010, at 2:11 AM, sridhar surampudi wrote:
> Hi I am looking in similar lines,
> 
> my requirement is 
> 
> 1. create a zpool on one or many devices ( LUNs ) from an array ( array can 
> be IBM or HPEVA or EMC etc.. not SS7000).
> 2. Create file systems on zpool
> 3. Once file systems are in use (I/0 is happening) I need to take snapshot at 
> array level
> a. Freeze the zfs flle system ( not required due to zfs consistency : source 
> : mailing groups)
> b. take array snapshot ( say .. IBM flash copy )
> c. Got new snapshot device (having same data and metadata including same GUID 
> of source pool)
> 
>  Now I need a way to change the GUID and pool of snapshot device so that the 
> snapshot device can be accessible on same host or an alternate host (if the 
> LUN is shared).

Methinks you need to understand a little bit of architecture.  If you have an
exact copy, then it is indistinguishable from the original. If ZFS (or insert 
favorite application here) sees two identical views of the data that are not, in fact,
identical, then you break the assumption that the application makes. By changing
the GUID you are forcing them to not be identical, which is counter to the whole
point of "hardware snapshots."  Perhaps what you are trying to do and the 
method you have chosen are not compatible.

BTW, I don't understand why you make a distinction between other arrays and the
SS7000 above. If I make a snapshot of a zvol, then it is identical from the 
client's
perspective, and the same conditions apply.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss