Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-09-01 Thread Richard Elling

On Sep 1, 2009, at 1:28 PM, Jason wrote:


> I guess I should come at it from the other side:
>
> If you have 1 iSCSI target box and it goes down, you're dead in the
> water.

Yep.

> If you have 2 iSCSI target boxes that replicate and one dies, you
> are OK, but you then have to have a 2:1 total storage to usable ratio
> (excluding expensive shared disks).

Servers cost more than storage, especially when you consider power.

> If you have 2 tiers where you have n cheap back-end iSCSI targets
> that have the physical disks in them and present them to 2 clustered
> virtual iSCSI target servers (assuming this can be done with disks
> over iSCSI) that are presenting the iSCSI targets to the VMware
> hosts, then any one server could go down but everything would keep
> running.  It would create a virtual clustered pair that is basically
> doing RAID over the network (iSCSI).  Since you already have the
> VMware hosts, the 2 virtual ones are "free".  None of the back-end
> servers would need redundant components because any one can fail, so
> you should be able to build them with inexpensive parts.

This will certainly work. But it is, IMHO, too complicated to be
effective at producing high-availability services. Too many parts means
too many opportunities for failure (yes, even VMware fails). The
problem with your approach is that you seem to be considering only
failures of the type "it's broke, so it is completely dead." Those
aren't the kind of failures that dominate real life.

When we design highly available systems for the datacenter, we spend
a lot of time on rapid recovery. We know things will break, so we try
to build systems and processes that can recover as quickly as
possible. This leads to the observation that reliability trumps
redundancy -- though we build fast-recovery systems, it is better not
to need to recover at all. Hence we developed dependability benchmarks
which expose the cost/dependability trade-offs. More reliable parts
tend to cost more, but the best approach is to have fewer reliable
parts rather than more unreliable parts.
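
To put rough numbers on the parts-count point (invented figures, for
illustration only): a service with five components in series, each
99.9% available, is 0.999^5 = ~99.5% available, which works out to
roughly 44 hours of downtime a year instead of 9. Redundancy buys some
of that back, but only as much as the failover mechanism itself
doesn't spend.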

> This would also allow you to add/replace storage easily (I hope).
> Perhaps you'd have to RAIDZ the backend disks together and then
> present them to the front-end, which would RAIDZ all the back-ends
> together.  For example, if you had 5 backend boxes with 8 drives
> each, you'd have a 10:7 total-to-usable ratio.  I'm sure the RAID
> combinations could be played with to get the balance of redundancy
> and capacity that you need.  I don't know what kind of performance
> hit you would take doing that over iSCSI, but I thought it might
> work as long as you have gigabit speeds.  Or I could be completely
> off my rocker. :) Am I?

Don't worry about bandwidth. It is the latency that will kill
performance. Adding more stuff between your CPU and the media means
increasing latency.
 -- richard



Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-09-01 Thread Scott Meilicke
You are completely off your rocker :)

No, just kidding. Assuming the virtual front-end servers are running on 
different hosts, and you are doing some sort of RAID, you should be fine. 
Performance may be poor due to the inexpensive targets on the back end, but you 
probably know that. A while back I thought of doing similar stuff using local 
storage on my ESX hosts, and abstracting that with an OpenSolaris VM and 
iSCSI/NFS.

Perhaps consider inexpensive but decent NAS/SAN devices from Synology. They 
offer NFS and iSCSI, and you can also replicate/back up between two of them 
using rsync. Yes, you would be 'wasting' storage space by having two, but 
like I said, they are inexpensive. Then you would not have the two-layer 
architecture.

I just tested a two-disk model, using ESXi 3.5u4 and a Windows VM. I used 
iometer's "real world" test, and IOs were about what you would expect from 
mirrored 7200 RPM SATA drives - 138 IOPS, about 1.1 MB/s (consistent with 
138 IOPS at the test's ~8 KB transfer size). The internal CPU was around 
20%, and RAM usage was 128MB out of the 512MB on board, so it was disk 
limited.

The Dell 2950 that I have 2009.06 installed on (16GB of RAM and an LSI HBA 
with an external SAS enclosure) with a single mirror using two 7200 RPM 
drives gave me about 200 IOPS using the same test, presumably because of the 
large amount of RAM available to the ARC cache.

-Scott


Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-09-01 Thread Jason
I guess I should come at it from the other side:

If you have 1 iSCSI target box and it goes down, you're dead in the water.

If you have 2 iSCSI target boxes that replicate and one dies, you are OK, but 
you then have to have a 2:1 total storage to usable ratio (excluding expensive 
shared disks).

If you have 2 tiers where you have n cheap back-end iSCSI targets that have the 
physical disks in them and present them to 2 clustered virtual iSCSI target 
servers (assuming this can be done with disks over iSCSI) that are presenting 
the iSCSI targets to the VMware hosts, then any one server could go down but 
everything would keep running.  It would create a virtual clustered pair that 
is basically doing RAID over the network (iSCSI).  Since you already have the 
VMware hosts, the 2 virtual ones are "free".  None of the back-end servers 
would need redundant components because any one can fail, so you should be able 
to build them with inexpensive parts.  

This would also allow you to add/replace storage easily (I hope).  Perhaps 
you'd have to RAIDZ the backend disks together and then present them to the 
front-end, which would RAIDZ all the back-ends together.  For example, if you 
had 5 backend boxes with 8 drives each, you'd have a 10:7 total-to-usable 
ratio.  I'm sure the RAID combinations could be played with to get the balance 
of redundancy and capacity that you need.  I don't know what kind of 
performance hit you would take doing that over iSCSI, but I thought it might 
work as long as you have gigabit speeds.  Or I could be completely off my 
rocker. :) Am I?
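
A minimal sketch of what the two tiers might look like on a 2009-era
OpenSolaris box, using the legacy shareiscsi ZFS target (the IP
addresses and cXtYdZ device names are invented; `format` would show
the real ones):

    # On each of the 5 back-end boxes: RAIDZ the 8 local drives,
    # carve out a zvol, and export it as an iSCSI target.
    zpool create backpool raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
        c1t4d0 c1t5d0 c1t6d0 c1t7d0
    zfs create -V 3T backpool/lun0
    zfs set shareiscsi=on backpool/lun0

    # On the front-end head: discover the back-ends (one
    # discovery-address per box), then RAIDZ their LUNs together.
    iscsiadm add discovery-address 192.168.10.11:3260
    iscsiadm modify discovery --sendtargets enable
    devfsadm -i iscsi      # remote LUNs now show up as local disks
    zpool create frontpool raidz c4t1d0 c4t2d0 c4t3d0 c4t4d0 c4t5d0

The 10:7 arithmetic checks out: 40 raw drives, with the 8-disk RAIDZ
keeping 7/8 and the 5-way RAIDZ keeping 4/5, gives 40 x 7/8 x 4/5 = 28
drives' worth usable.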


Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-09-01 Thread Richard Elling

On Sep 1, 2009, at 12:17 PM, Jason wrote:

> True, though an enclosure for shared disks is expensive.  This isn't
> for production but for me to explore what I can do with x86/x64
> hardware.  The idea being that I can just throw up another x86/x64
> box to add more storage.  Has anyone tried anything similar?


You mean something like this?

   disk --- server ---+
                      +-- server --- network --- client
   disk --- server ---+

I'm not sure how that can be less expensive in the TCO sense.
 -- richard



Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-09-01 Thread Tim Cook
On Tue, Sep 1, 2009 at 2:17 PM, Jason  wrote:

> True, though an enclosure for shared disks is expensive.  This isn't for
> production but for me to explore what I can do with x86/x64 hardware.  The
> idea being that I can just throw up another x86/x64 box to add more storage.
>  Has anyone tried anything similar?


I still don't understand why you need this two-layer architecture.  Just add
a server to the mix, and add the new storage to VMware.  If you're doing
iSCSI, you'll hit the LUN size limitations long before you'll need a second
box.

--Tim


Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-09-01 Thread Jason
True, though an enclosure for shared disks is expensive.  This isn't for 
production but for me to explore what I can do with x86/x64 hardware.  The idea 
being that I can just throw up another x86/x64 box to add more storage.  Has 
anyone tried anything similar?


Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-09-01 Thread Richard Elling

On Sep 1, 2009, at 11:45 AM, Jason wrote:

> So aside from the NFS debate, would this 2-tier approach work?  I am
> a bit fuzzy on how I would get the RAIDZ2 redundancy but still
> present the volume to the VMware host as a raw device.  Is that
> possible or is my understanding wrong?  Also could it be defined as
> a clustered resource?


The easiest and proven method is to use shared disks, two heads,
ZFS, and Open HA Cluster to provide highly available NFS or iSCSI
targets. This is the fundamental architecture for most HA implementations.

An implementation, which does not use Open HA Cluster, is available
in appliance form as the Sun Storage 7310 or 7410 Cluster System.
But if you are building your own, Open HA Cluster may be a better
choice than rolling your own cluster software.
 -- richard
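
To make concrete what the cluster framework automates, the manual
failover on such a shared-disk pair looks roughly like this (the pool,
share, NIC, and floating IP names are invented):

    # On the surviving head, after the active head dies:
    zpool import -f tank    # -f because the dead head never exported it
    ifconfig e1000g0 addif 192.168.10.50 netmask 255.255.255.0 up
    zfs set sharenfs=on tank/vmstore   # or re-enable the iSCSI target

The hard part is making sure both heads never import the pool at the
same time, which would corrupt it; that fencing and arbitration is
precisely what Open HA Cluster provides.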



Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-09-01 Thread Jason
So aside from the NFS debate, would this 2-tier approach work?  I am a bit 
fuzzy on how I would get the RAIDZ2 redundancy but still present the volume to 
the VMware host as a raw device.  Is that possible or is my understanding 
wrong?  Also could it be defined as a clustered resource?


Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-08-31 Thread Erik Trimble
On Mon, 2009-08-31 at 18:26 -0400, David Magda wrote:
> On Aug 31, 2009, at 17:29, Tim Cook wrote:
> 
> > I've got MASSIVE deployments of VMware on NFS over 10g that achieve
> > stellar performance (admittedly, it isn't on zfs).
> 
> Without a separate ZIL device, NFS would probably be slower with ZFS --
> hence why Sun's own appliances use SSDs.
> 

Hmm.

On a related note: I'm looking to use Sun's xVM on our Nehalem
(x4170) machines, and was assuming I'd be best off using iSCSI targets
exported from my ZFS-based disk machine.

Under xVM (Xen-based, or possibly VirtualBox, too), would I be better
off having an iSCSI raw partition mounted on the xVM server, or using
NFS?  (Assume I would have SSD accelerators on the ZFS disk machine.)

I'm looking at performance issues, not things like being able to grow
the image under xVM (I'm hosting QA machines in xVM).



-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)



Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-08-31 Thread David Magda

On Aug 31, 2009, at 17:29, Tim Cook wrote:

> I've got MASSIVE deployments of VMware on NFS over 10g that achieve
> stellar performance (admittedly, it isn't on zfs).


Without a separate ZIL device, NFS would probably be slower with ZFS --
hence why Sun's own appliances use SSDs.
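
For reference, a sketch of the slog setup being alluded to, assuming a
pool named tank and an SSD at the invented device c2t5d0:

    # Move the ZFS intent log onto an SSD so NFS's synchronous writes
    # stop waiting on rotating media.
    zpool add tank log c2t5d0

    # With two SSDs, mirror the slog:
    #   zpool add tank log mirror c2t5d0 c2t6d0

    zpool status tank   # the SSD appears under a separate "logs" section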




Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-08-31 Thread Jason
Specifically, I remember Storage VMotion being supported on NFS last, as well 
as jumbo frames.  That's just the impression I get from past features; perhaps 
they are doing better with that now.

I know the performance problem had specifically to do with ZFS and the way it 
handled something.  I know of lots of implementations with just straight NFS, 
so I know that works... I'm not opposed to NFS, but I was hoping what he saw 
was just a combination of ZFS over NFS, as he said he didn't know if it would 
happen over iSCSI.  So I thought I'd try that first.  I'll have to see if I 
can get the details from him tomorrow.


Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-08-31 Thread Tim Cook
On Mon, Aug 31, 2009 at 4:26 PM, Jason  wrote:

> Well, I knew a guy who was involved in a project to do just that for a
> production environment.  Basically they abandoned using that because there
> was a huge performance hit using ZFS over NFS.  I didn’t get the specifics
> but his group is usually pretty sharp.  I’ll have to check back with him.
>  So mainly just to avoid that, but also VMware tends to roll out storage
> features on NFS last after fibre and iSCSI.
>
> *sorry if this is duplicate... Learning the workings of this discussion
> forum as well*



That's not true at all.  Dynamic grow and shrink has been available on NFS
forever.  You STILL can't shrink VMFS, and they've JUST added grow
capabilities.  Not to mention NFS being thin provisioned by default.  As for
performance, I have a tough time believing his performance issues were
because of NFS, and not some other underlying bug.

I've got MASSIVE deployments of VMware on NFS over 10g that achieve stellar
performance (admittedly, it isn't on zfs).

--Tim


Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-08-31 Thread Jason
Well, I knew a guy who was involved in a project to do just that for a 
production environment.  Basically they abandoned using that because there was 
a huge performance hit using ZFS over NFS.  I didn’t get the specifics but his 
group is usually pretty sharp.  I’ll have to check back with him.  So mainly 
just to avoid that, but also VMware tends to roll out storage features on NFS 
last after fibre and iSCSI.

*sorry if this is duplicate... Learning the workings of this discussion forum 
as well*


Re: [zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-08-31 Thread Tim Cook
On Mon, Aug 31, 2009 at 3:42 PM, Jason  wrote:

> I've been looking to build my own cheap SAN to explore HA scenarios with
> VMware hosts, though not for a production environment.  I'm new to
> opensolaris but I am familiar with other clustered HA systems.  The features
> of ZFS seem like they would fit right in with attempting to build an HA
> storage platform for VMware hosts on inexpensive hardware.
>
> Here is what I am thinking.  I want to have at least two clustered nodes
> (may be virtual running off the local storage of the VMware host) that act
> as the front end of the SAN.  These will not have any real storage
> themselves, but will be initiators for backend computers with the actual
> disks in them.  I want to be able to add and remove/replace at will so I
> figure the backends will just be fairly dumb iSCSI targets that just present
> each disk.  That way the front ends are close to the hardware for zfs to
> work best but would not limit a raid set to the capacity of a single
> enclosure.
>
> I'd like to present a RAIDZ2 array as a block device to VMware, how would
> that work?  Could that then be clustered so the iSCSI target is HA?  Am I
> completely off base or is there an easier way?  My goal is to be able to
> kill any one box (or multiple) and still keep the storage available for
> VMware, but still get a better total storage to usable ratio than just a
> plain mirror (2:1).  I also want to be able to add and remove storage
> dynamically.  You know, champagne on a beer budget. :)
>
>
Any particular reason you want to present block storage to VMware?  It works
as well, if not better, over NFS, and saves a LOT of headaches.

--Tim


[zfs-discuss] ZFS iSCSI Clustered for VMware Host use

2009-08-31 Thread Jason
I've been looking to build my own cheap SAN to explore HA scenarios with VMware 
hosts, though not for a production environment.  I'm new to OpenSolaris, but I 
am familiar with other clustered HA systems.  The features of ZFS seem like 
they would fit right in with attempting to build an HA storage platform for 
VMware hosts on inexpensive hardware.

Here is what I am thinking.  I want to have at least two clustered nodes 
(maybe virtual, running off the local storage of the VMware host) that act as 
the front end of the SAN.  These will not have any real storage themselves, 
but will be initiators for backend computers with the actual disks in them.  I 
want to be able to add and remove/replace at will, so I figure the backends 
will just be fairly dumb iSCSI targets that present each disk.  That way the 
front ends are close to the hardware, for ZFS to work best, but a raid set 
would not be limited to the capacity of a single enclosure.

I'd like to present a RAIDZ2 array as a block device to VMware; how would that 
work?  Could that then be clustered so the iSCSI target is HA?  Am I completely 
off base, or is there an easier way?  My goal is to be able to kill any one box 
(or multiple) and still keep the storage available for VMware, but still get a 
better total storage to usable ratio than a plain mirror (2:1).  I also want to 
be able to add and remove storage dynamically.  You know, champagne on a beer 
budget. :)
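
On the "present a RAIDZ2 array as a block device" question, a minimal
single-box sketch using the legacy shareiscsi target (the pool, volume,
and sizes are invented):

    # Six disks, any two of which can fail:
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

    # A zvol exported over iSCSI is what ESX sees as a raw LUN:
    zfs create -V 500G tank/vmware0
    zfs set shareiscsi=on tank/vmware0
    iscsitadm list target   # shows the IQN to point the ESX initiator at

Making that target itself HA is the part that needs shared disks plus
cluster software, since a pool can be imported on only one head at a
time; the rest of the thread digs into exactly that.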