> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Tim Cook
> 
> In my example - probably not a completely clustered FS.
> A clustered ZFS pool with datasets individually owned by
> specific nodes at any given time would suffice for such
> VM farms. This would give users the benefits of ZFS
> (resilience, snapshots and clones, shared free space)
> merged with the speed of direct disk access instead of
> lagging through a storage server accessing these disks.

I think I see a couple of points of disconnect.

#1 - You seem to be assuming storage is slower when it's on a remote storage
server as opposed to a local disk.  While that's typically true over
Ethernet, it's not necessarily true over InfiniBand or Fibre Channel.  That
said, I don't want to assume everyone should be shoe-horned into IB or FC,
which have some significant downsides: cost, centralization of the storage,
a single point of failure, and so on.  So there is some ground to be gained
in saving cost and/or increasing workload distribution and/or scalability.
One size doesn't fit all, and I like the fact that you're thinking of
something different.

#2 - You're talking about a clustered FS, but the characteristics you need
are closer to those of a distributed filesystem.  In a clustered FS, you
have something like a LUN on a SAN: a raw device simultaneously mounted by
multiple OSes.  In a distributed FS such as Lustre, you have a configurable
level of redundancy (maybe zero) spread across multiple systems (maybe all),
while all hosts share the same namespace.  So each system doing heavy IO
works at local disk speed, but any system that needs data created by another
system must access that data remotely.

If the goal is to do something like VMotion, including the storage:
VMotion by itself would be largely pointless if the VM storage still
remains on the node that was previously the compute head.

So let's imagine for a moment that you have two systems, connected directly
to each other over InfiniBand or any bus whose remote performance matches
its local performance.  You build a zpool mirror out of the local disk and
the remote disk.  Then you should (theoretically) be able to do something
like VMotion from one system to the other and kill the original system.
Even if the original system dies ungracefully and the VM dies with it, you
can still boot the VM on the second system, and the only loss you've
suffered is an ungraceful reboot.
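
Just to make that concrete, here's a rough sketch of what I mean.  It
assumes each node exposes its local disk to the other as a LUN over
something like iSCSI or SRP; the pool and device names are made up for
illustration:

  nodeA# zpool create vmpool mirror c1t0d0 c2t0d0
         # c1t0d0 = node A's local disk, c2t0d0 = node B's disk seen as a LUN
  (run the VM with its storage on vmpool)
  nodeA# zpool export vmpool        # planned migration
  nodeB# zpool import vmpool
  nodeB# zpool import -f vmpool     # or force-import the surviving half
                                    # if node A died ungracefully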

If you do the same thing over Ethernet, then performance will be degraded
to Ethernet speeds.  So take it for granted: no matter what you do, you
either need a bus that performs just as well remotely as locally, or
performance will be degraded, or else it's kind of pointless because the VM
storage lives only on the system that you want to VMotion away from.

