Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-25 Thread Vincent Fox
> We need high availability, so are looking at Sun Cluster. That seems to
> add an extra layer of complexity, but there's no way I'll get signoff on
> a solution without redundancy. It would appear that ZFS failover is
> supported with the latest version of Solaris/Sun Cluster? I was speaking
> with a Sun SE who claimed that ZFS would actually operate active/active
> in a cluster, simultaneously writable by both nodes. From what I had
> read, ZFS is not a cluster file system, and would only operate in the
> active/passive failover capacity. Any comments?

The SE is not correct.  There are relatively few applications in
Sun Cluster that run as scalable services.  Most of them are "failover".
ZFS is definitely not a global file system, so that's one problem.
And NFS is a failover service.

This can actually be an asset to you. Think of it this way: you
have a KNOWN capacity.  You do not have to worry that a failure
of one node at peak leaves you crippled.

Also, have you ever had Sun patches break things? We
certainly have enough scars from that. You can patch the idle
node, do a cluster switch so it's now the active node, and verify
function for a few days before patching the other node.  If a
problem crops up due to some new patch, you switch
it back the other way until you sort that out.
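
For what it's worth, that rolling patch cycle is mostly just resource-group
switchovers, roughly like this (a sketch only -- the resource group and node
names below are made up, not from any particular config):

  # move the HA-NFS/ZFS resource group onto the freshly patched node
  scswitch -z -g nfs-zfs-rg -h node2
  # ...run there for a few days, patch the other node, then switch back if desired
  scswitch -z -g nfs-zfs-rg -h node1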
 
 


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-25 Thread Vincent Fox
> The SE also told me that Sun Cluster requires hardware raid, which
> conflicts with the general recommendation to feed ZFS raw disk. It seems
> such a configuration would either require configuring vdevs directly on
> the raid LUNs, losing ZFS self-healing and checksum correction features,
> or losing space to not only the hardware raid level, but a partially
> redundant ZFS level as well. What is the general consensus on the best
> way to deploy ZFS under a cluster using hardware raid?

I have a pair of 3510FC units, each exporting 2 RAID-5 (5-disk) LUNs.

On the T2000 I map a LUN from each array into a mirror set, then add the 2nd
set the same way to the ZFS pool.  I guess it's RAID-5+1+0.  Yes, we have a
multipath SAN setup too.
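
Roughly, that layout can be built with something like the following (the pool
name matches the one below, but the LUN device names here are just
placeholders):

  # mirror one LUN from each 3510FC, then add a second mirror the same way
  zpool create ms1 mirror c4tARRAY1_LUN0d0 c4tARRAY2_LUN0d0
  zpool add ms1 mirror c4tARRAY1_LUN1d0 c4tARRAY2_LUN1d0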

e.g.

{cyrus1:vf5:133} zpool status -v
  pool: ms1
 state: ONLINE
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
ms1ONLINE   0 0 0
  mirror   ONLINE   0 0 0
c4t600C0FF00A73D97F16461700d0  ONLINE   0 0 0
c4t600C0FF00A719D7C1126E500d0  ONLINE   0 0 0
  mirror   ONLINE   0 0 0
c4t600C0FF00A73D94517C4A900d0  ONLINE   0 0 0
c4t600C0FF00A719D38B93FD200d0  ONLINE   0 0 0

errors: No known data errors

Works great.  Nothing beats having an entire 3510FC down and never having users
notice there is a problem.  I was replacing a controller in the 2nd array and
goofed up my cabling, taking the entire array offline.  Not a hiccup in service,
although I could see the problem in zpool status.  I sorted everything out,
plugged it up right, and everything was fine.

I like very much that the 3510 knows it has a global spare that is used for
that array, and having that level of things handled locally.  In ZFS, AFAICT,
there is no way to specify what affinity a spare has, so a spare from one array
that goes hot to replace a disk in the other array becomes an undesirable
cross-array dependency.
 
 


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-25 Thread Paul B. Henson
On Tue, 25 Sep 2007, Peter Tribble wrote:

> This was some time ago (a very long time ago, actually). There are two
> fundamental problems:
>
> 1. Each zfs filesystem consumes kernel memory. Significant amounts, 64K
> is what we worked out at the time. For normal numbers of filesystems that's
> not a problem; multiply it by tens of thousands and you start to hit serious
> resource usage.

Every server we've bought for about the last year came with 4 GB of memory;
the servers we would deploy for this would have at least 8 if not 16 GB.
Given the downtrend in memory prices, hopefully memory would not be an
issue.
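
(Back of the envelope: at the 64K-per-filesystem figure quoted above, 50,000
filesystems would be roughly 50,000 x 64 KB, or about 3.2 GB of kernel
memory -- workable on an 8-16 GB box, but far from free.)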


> 2. The zfs utilities didn't scale well as the number of filesystems
> increased.
[...]
> share all those filesystems) are there to improve scalability. Perhaps
> I should find a spare machine and try repeating the experiment.

There have supposedly been lots of improvements in scalability, based on my
review of mailing list archives. If you do find the time to experiment
again, I'd appreciate hearing what you find...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-25 Thread Paul B. Henson
On Mon, 24 Sep 2007, Dale Ghent wrote:

> Not to sway you away from ZFS/NFS considerations, but I'd like to add
> that people who in the past used DFS typically went on to replace it with
> AFS. Have you considered it?

You're right, AFS is the first choice coming to mind when replacing DFS. We
actually implemented an OpenAFS prototype last year and have been running
it for internal use only since then.

Unfortunately, like almost everything we've looked at, AFS is a step
backwards from DFS. As the precursor to DFS, AFS has enough similarities to
DFS to make the features it lacks almost more painful.

No per-file access control lists is a serious bummer. Integration with
Kerberos 5 rather than the internal kaserver is still at a bit of a duct
tape level, and only supports DES. Having to maintain an additional
repository of user/group information (pts) is a bit of a pain; while there
are long-term goals to replace that with some type of LDAP integration, I
don't see that happening anytime soon.

One of the most annoying things is that AFS requires integration at the
kernel level, yet is not maintained by the same people who maintain the
kernel. Frequently a Linux kernel upgrade will break AFS, and the
developers need to scramble to release a patch or update to resolve it.
While we are not currently using AFS under Solaris, based on mailing list
traffic similar issues arise there. One of the benefits of NFSv4 is that it
is a core part of the operating system, unlikely to be lightly broken during
updates.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-25 Thread Peter Tribble
On 9/24/07, Paul B. Henson <[EMAIL PROTECTED]> wrote:
> On Sat, 22 Sep 2007, Peter Tribble wrote:
>
> > filesystem per user on the server, just to see how it would work. While
> > managing 20,000 filesystems with the automounter was trivial, the attempt
> > to manage 20,000 zfs filesystems wasn't entirely successful. In fact,
> > based on that experience I simply wouldn't go down the road of one user
> > per filesystem.
>
> Really? Could you provide further detail about what problems you
> experienced? Our current filesystem based on DFS effectively utilizes a
> separate filesystem per user (although in DFS terminology they are called
> filesets), and we've never had a problem managing them.

This was some time ago (a very long time ago, actually). There are two
fundamental problems:

1. Each zfs filesystem consumes kernel memory. Significant amounts, 64K
is what we worked out at the time. For normal numbers of filesystems that's
not a problem; multiply it by tens of thousands and you start to hit serious
resource usage.

2. The zfs utilities didn't scale well as the number of filesystems increased.

I just kept on issuing zfs create until the machine had had enough. It got
through the first 10,000 without too much difficulty (as I recall that took
several hours), but soon got bogged down after that, to the point where it
took a day to do anything. At that point (at about 15,000 filesystems on a
1 GB system) it ran out of kernel memory and died. After that it wouldn't
even boot.
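
The experiment was essentially a loop along these lines (the pool and naming
scheme here are illustrative, not the original script):

  #!/bin/sh
  # keep creating filesystems until something gives
  i=1
  while [ $i -le 20000 ]; do
      zfs create tank/home/user$i || break
      i=`expr $i + 1`
  done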

I know that some work has gone into improving the performance of the
utilities, and things like in-kernel sharetab (we never even tried to
share all those filesystems) are there to improve scalability. Perhaps
I should find a spare machine and try repeating the experiment.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-25 Thread James F. Hranicky
Paul B. Henson wrote:

> But all quotas were set in a single flat text file. Anytime you added a new
> quota, you needed to turn off quotas, then turn them back on, and quota
> enforcement was disabled while it recalculated space utilization.

I believe in later versions of the OS 'quota resize' did this without
the massive recalculation.
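
(If I recall correctly, after editing /etc/quotas on the filer that's just
something like the following -- volume name made up:

  filer> quota resize vol_home

which rereads the changed limits without the full quota off/on recalculation.)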

Jim


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Dale Ghent
On Sep 24, 2007, at 6:15 PM, Paul B. Henson wrote:

> Well, considering that some days we automatically create accounts for
> thousands of students, I wouldn't want to be the one stuck typing 'zfs
> create' a thousand times 8-/. And that still wouldn't resolve our
> requirement for our help desk staff to be able to manage quotas  
> through our
> existing identity management system.

Not to sway you away from ZFS/NFS considerations, but I'd like to add  
that people who in the past used DFS typically went on to replace it  
with AFS. Have you considered it?

/dale


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Jonathan Loran



Paul B. Henson wrote:
> On Sat, 22 Sep 2007, Jonathan Loran wrote:
>
>> My gut tells me that you won't have much trouble mounting 50K file
>> systems with ZFS.  But who knows until you try.  My questions for you is
>> can you lab this out?
>
> Yeah, after this research phase has been completed, we're going to have to
> go into a prototyping phase. I should be able to get funding for a half
> dozen or so x4100 systems to play with. We standardized on those systems
> for our Linux deployment.
>
>> test with your name service in the loop.  You may need netgroups to
>> delineate permissions for your shares, and to define your automounter
>> maps.
>
> We're planning to use NFSv4 with Kerberos authentication, so shouldn't need
> netgroups. Tentatively I think I'd put automounter maps in LDAP, although
> doing so for both Solaris and Linux at the same time based on a little
> quick research seems possibly problematic.

We finally got autofs maps via LDAP working smoothly with both Linux
(CentOS 4.x and 5.x) and Solaris (8,9,10).  It took a lot of trial and
error.  We settled on the Fedora Directory server, because that worked
across the board.  I'm not the admin who did the leg work on that
though, so I can't really comment as to where we ran into problems.  If
you want, I can find out more on that and respond off the list.

>> Also, as you may know, Linux doesn't play well with hundreds of
>> concurrent mount operations.  If you use Linux NFS clients in your
>> environment, be sure to lab that out as well.
>
> I didn't know that -- we're currently using RHEL 4 and Gentoo distributions
> for a number of services. I've done some initial testing of NFSv4, but
> never tried lots of simultaneous mounts...

Sort of an old problem, but using the insecure option in your
exports/shares and mount opts helps.  May have been patched by now
though.  Too much Linux talk for this list ;)

>> At any rate, you may indeed be an outlier with so many file systems and
>> NFS mounts, but I imagine many of us are waiting on the edge of our seats
>> to see if you can make it all work.  Speaking for my self, I would love
>> to know how ZFS, NFS and LDAP scale up to such a huge system.
>
> I don't necessarily mind being a pioneer, but not on this particular
> project -- it has a rather high visibility and it would not be good for it
> to blow chunks after deployment when use starts scaling up 8-/.

Good luck.

--


- _/ _/  /   - Jonathan Loran -   -
-/  /   /IT Manager   -
-  _  /   _  / / Space Sciences Laboratory, UC Berkeley
-/  / /  (510) 643-5146 [EMAIL PROTECTED]
- __/__/__/   AST:7731^29u18e3






Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Paul B. Henson
On Mon, 24 Sep 2007, Richard Elling wrote:

> > Perhaps I should have been more clear -- a remote facility available via
> > programmatic access, not manual user direct access. If I wanted to do
>
> I'd argue that it isn't worth the trouble.
>   zfs create
>   zfs set
> is all that would be required.  If you are ok with inheritance, zfs create
> will suffice.

Well, considering that some days we automatically create accounts for
thousands of students, I wouldn't want to be the one stuck typing 'zfs
create' a thousand times 8-/. And that still wouldn't resolve our
requirement for our help desk staff to be able to manage quotas through our
existing identity management system.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Paul B. Henson
On Mon, 24 Sep 2007, Richard Elling wrote:

> Yes.  Sun currently has over 45,000 users with automounted home
> directories. I do not know how many servers are involved, though, in part
> because home directories are highly available services and thus their
> configuration is abstracted away from the clients.

Hmm, highly available home directories -- that sounds like what I'm looking
for ;).

Any other Sun employees on the list that might be able to provide further
details of the internal Sun ZFS/NFS auto mounted home directory
implementation?


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Paul B. Henson
On Mon, 24 Sep 2007, Richard Elling wrote:

> I can't imagine a web server serving tens of thousands of pages.  I think
> you should put a more scalable architecture in place, if that is your
> goal. BTW, there are many companies that do this: google, yahoo, etc.
> In no case do they have a single file system or single server dishing out
> thousands of sites.

Our current implementation already serves tens of thousands of pages, and
it's for the most part running on 8-10 year old hardware. We have three
core DFS servers housing files, and three web servers serving content. The
only time we've ever had a problem was once we got Slashdot'd by a staff
member's personal project:

http://www.csupomona.edu/~jelerma/springfield/map/index.html


other than that, it's been fine. I can't imagine that brand-new hardware
running a shiny new filesystem couldn't handle the same load 10-year-old
hardware has been carrying. Although arguably, considering I can't find
anything equivalent feature-wise to DFS, perhaps the current offerings
aren't equivalent scalability-wise either :(...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Mike Gerdts
On 9/24/07, Paul B. Henson <[EMAIL PROTECTED]> wrote:
> but checking the actual release notes shows no ZFS mention. 3.0.26 to
> 3.2.0? That seems an odd version bump...

3.0.x and before are GPLv2.  3.2.0 and later are GPLv3.

http://news.samba.org/announcements/samba_gplv3/

-- 
Mike Gerdts
http://mgerdts.blogspot.com/


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Paul B. Henson
On Sat, 22 Sep 2007, Jonathan Loran wrote:

> My gut tells me that you won't have much trouble mounting 50K file
> systems with ZFS.  But who knows until you try.  My questions for you is
> can you lab this out?

Yeah, after this research phase has been completed, we're going to have to
go into a prototyping phase. I should be able to get funding for a half
dozen or so x4100 systems to play with. We standardized on those systems
for our Linux deployment.

> test with your name service in the loop.  You may need netgroups to
> delineate permissions for your shares, and to define your automounter
> maps.

We're planning to use NFSv4 with Kerberos authentication, so shouldn't need
netgroups. Tentatively I think I'd put automounter maps in LDAP, although
doing so for both Solaris and Linux at the same time based on a little
quick research seems possibly problematic.

> Also, as you may know, Linux doesn't play well with hundreds of
> concurrent mount operations.  If you use Linux NFS clients in your
> environment, be sure to lab that out as well.

I didn't know that -- we're currently using RHEL 4 and Gentoo distributions
for a number of services. I've done some initial testing of NFSv4, but
never tried lots of simultaneous mounts...

> At any rate, you may indeed be an outlier with so many file systems and
> NFS mounts, but I imagine many of us are waiting on the edge of our seats
> to see if you can make it all work.  Speaking for my self, I would love
> to know how ZFS, NFS and LDAP scale up to such a huge system.

I don't necessarily mind being a pioneer, but not on this particular
project -- it has a rather high visibility and it would not be good for it
to blow chunks after deployment when use starts scaling up 8-/.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Paul B. Henson
On Sat, 22 Sep 2007, Peter Tribble wrote:

> filesystem per user on the server, just to see how it would work. While
> managing 20,000 filesystems with the automounter was trivial, the attempt
> to manage 20,000 zfs filesystems wasn't entirely successful. In fact,
> based on that experience I simply wouldn't go down the road of one user
> per filesystem.

Really? Could you provide further detail about what problems you
experienced? Our current filesystem based on DFS effectively utilizes a
separate filesystem per user (although in DFS terminology they are called
filesets), and we've never had a problem managing them.


> directories). This has been fixed, I believe, but only very recently in
> S10.]

As long as the fix has been included in U4 we should be good...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Paul B. Henson
On Fri, 21 Sep 2007, Ed Plese wrote:

> ZFS ACL support was going to be merged into 3.0.26 but 3.0.26 ended up
> being a security fix release and the merge got pushed back.  The next
> release will be 3.2.0 and ACL support will be in there.

Arg, you're right, I based that on the mailing list posting:

http://marc.info/?l=samba-technical&m=117918697907120&w=2

but checking the actual release notes shows no ZFS mention. 3.0.26 to
3.2.0? That seems an odd version bump...


> As others have pointed out though, Samba is included in Solaris 10
> Update 4 along with support for ZFS ACLs, Active Directory, and SMF.

I usually prefer to use the version directly from the source, but depending
on the timeliness of the release of 3.2.0 maybe I'll have to make an
exception. SMF I know is the new Solaris service management framework
replacing /etc/init.d scripts, but what additional Active Directory support
does the Sun-branded Samba include over stock?

> The patches for the shadow copy module can be found here:
>
> http://www.edplese.com/samba-with-zfs.html

Ah, I thought I recognized your name :) -- I came across that page while
researching ZFS. Thanks for your work on that patch; hopefully it will be
accepted into mainstream Samba soon.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Richard Elling
Paul B. Henson wrote:
> On Thu, 20 Sep 2007, Richard Elling wrote:
> 
>> 50,000 directories aren't a problem, unless you also need 50,000 quotas
>> and hence 50,000 file systems.  Such a large, single storage pool system
>> will be an outlier... significantly beyond what we have real world
>> experience with.
> 
> Yes, considering that 45,000 of those users will be students, we definitely
> need separate quotas for each one :).

or groups... think long tail.

> Hmm, I get a bit of a shiver down my spine at the prospect of deploying a
> critical central service in a relatively untested configuration 8-/. What
> is the maximum number of file systems in a given pool that has undergone
> some reasonable amount of real world deployment?

good question.  I might have some field data on this, but won't be able to
look at it for a month or three.  Perhaps someone on the list will brag ;-)

> One issue I have is that our previous filesystem, DFS, completely spoiled
> me with its global namespace and location transparency. We had three fairly
> large servers, with the content evenly dispersed among them, but from the
> perspective of the client any user's files were available at
> /dfs/user/, regardless of which physical server they resided on.
> We could even move them around between servers transparently.
> 
> Unfortunately, there aren't really any filesystems available with similar
> features and enterprise applicability. OpenAFS comes closest, we've been
> prototyping that but the lack of per file ACLs bites, and as an add-on
> product we've had issues with kernel compatibility across upgrades.
> 
> I was hoping to replicate a similar feel by just having one large file
> server with all the data on it. If I split our user files across multiple
> servers, we would have to worry about which server contained what files,
> which would be rather annoying.
> 
> There are some features in NFSv4 that seem like they might someday help
> resolve this problem, but I don't think they are readily available in
> servers and definitely not in the common client.
> 
>>> I was planning to provide CIFS services via Samba. I noticed a posting a
>>> while back from a Sun engineer working on integrating NFSv4/ZFS ACL support
>>> into Samba, but I'm not sure if that was ever completed and shipped either
>>> in the Sun version or pending inclusion in the official version, does
>>> anyone happen to have an update on that? Also, I saw a patch proposing a
>>> different implementation of shadow copies that better supported ZFS
>>> snapshots, any thoughts on that would also be appreciated.
>> This work is done and, AFAIK, has been integrated into S10 8/07.
> 
> Excellent. I did a little further research myself on the Samba mailing
> lists, and it looks like ZFS ACL support was merged into the official
> 3.0.26 release. Unfortunately, the patch to improve shadow copy performance
> on top of ZFS still appears to be floating around the technical mailing
> list under discussion.
> 
>>> Is there any facility for managing ZFS remotely? We have a central identity
>>> management system that automatically provisions resources as necessary for
> [...]
>> This is a loaded question.  There is a webconsole interface to ZFS which can
>> be run from most browsers.  But I think you'll find that the CLI is easier
>> for remote management.
> 
> Perhaps I should have been more clear -- a remote facility available via
> programmatic access, not manual user direct access. If I wanted to do
> something myself, I would absolutely login to the system and use the CLI.
> However, the question was regarding an automated process. For example, our
> Perl-based identity management system might create a user in the middle of
> the night based on the appearance in our authoritative database of that
> user's identity, and need to create a ZFS filesystem and quota for that
> user. So, I need to be able to manipulate ZFS remotely via a programmatic
> API.

I'd argue that it isn't worth the trouble.
zfs create
zfs set
is all that would be required.  If you are ok with inheritance, zfs create
will suffice.
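
For instance, a provisioning job could get by with nothing fancier than
running the CLI over ssh (a sketch only -- the host, pool, and quota below
are made-up examples):

  ssh root@zfsserver zfs create -o quota=2G export/home/jdoe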

>> Active/passive only.  ZFS is not supported over pxfs and ZFS cannot be
>> mounted simultaneously from two different nodes.
> 
> That's what I thought, I'll have to get back to that SE. Makes me wonder as
> to the reliability of his other answers :).
> 
>> For most large file servers, people will split the file systems across
>> servers such that under normal circumstances, both nodes are providing
>> file service.  This implies two or more storage pools.
> 
> Again though, that would imply two different storage locations visible to
> the clients? I'd really rather avoid that. For example, with our current
> Samba implementation, a user can just connect to
> '\\files.csupomona.edu\' to access their home directory or
> '\\files.csupomona.edu\' to access a shared group directory.
> They don't need to worry on which physical server it resides or determine
> what server name to connect to.

Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Richard Elling
Paul B. Henson wrote:
> On Fri, 21 Sep 2007, James F. Hranicky wrote:
> 
>>> It just seems rather involved, and relatively inefficient to continuously
>>> be mounting/unmounting stuff all the time. One of the applications to be
>>> deployed against the filesystem will be web service, I can't really
>>> envision a web server with tens of thousands of NFS mounts coming and
>>> going, seems like a lot of overhead.
>> Well, that's why ZFS wouldn't work for us :-( .
> 
> Although, I'm just saying that from my gut -- does anyone have any actual
> experience with automounting thousands of file systems? Does it work? Is it
> horribly inefficient? Poor performance? Resource intensive?

Yes.  Sun currently has over 45,000 users with automounted home directories.
I do not know how many servers are involved, though, in part because home
directories are highly available services and thus their configuration is
abstracted away from the clients.  Suffice to say, there is more than one
server.  Measuring mount performance would vary based on where in the world
you were, so it probably isn't worth the effort.
  -- richard


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Richard Elling
Paul B. Henson wrote:
> On Thu, 20 Sep 2007, James F. Hranicky wrote:
> 
>> This can be solved using an automounter as well.
> 
> Well, I'd say more "kludged around" than "solved" ;), but again unless
> you've used DFS it might not seem that way.
> 
> It just seems rather involved, and relatively inefficient to continuously
> be mounting/unmounting stuff all the time. One of the applications to be
> deployed against the filesystem will be web service, I can't really
> envision a web server with tens of thousands of NFS mounts coming and
> going, seems like a lot of overhead.
> 
> I might need to pursue a similar route though if I can't get one large
> system to house everything in one place.

I can't imagine a web server serving tens of thousands of pages.  I think
you should put a more scalable architecture in place, if that is your goal.
BTW, there are many companies that do this: google, yahoo, etc.  In no
case do they have a single file system or single server dishing out
thousands of sites.
  -- richard


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-22 Thread Jonathan Loran


Paul,

My gut tells me that you won't have much trouble mounting 50K file
systems with ZFS.  But who knows until you try.  My question for you is:
can you lab this out?  You could build a commodity server with a ZFS
pool on it.  Heck, it could be a small pool, one disk, and then put your
50K file systems on that.  Reboot, thrash about, and see what happens.

Then the next step would be fooling with the client side of things.  If
you can get time on a chunk of your existing client systems, see if you
can mount a bunch of those 50K file systems smoothly.  Off hours,
perhaps.

The next problem of course, and to be honest, this may be the
killer: test with your name service in the loop.  You may need netgroups
to delineate permissions for your shares, and to define your automounter
maps.  In my personal experience, with about 1-2% as many shares and
mount points as you need, the name servers get stressed out really
fast.  There have been some issues around LDAP port reuse in Solaris
that can cause some headaches as well, but there are patches to help you
too.  Also, as you may know, Linux doesn't play well with hundreds of
concurrent mount operations.  If you use Linux NFS clients in your
environment, be sure to lab that out as well.

At any rate, you may indeed be an outlier with so many file systems and
NFS mounts, but I imagine many of us are waiting on the edge of our
seats to see if you can make it all work.  Speaking for myself, I would
love to know how ZFS, NFS and LDAP scale up to such a huge system.


Regards,

Jon

Paul B. Henson wrote:
> On Fri, 21 Sep 2007, James F. Hranicky wrote:
>
>>> It just seems rather involved, and relatively inefficient to continuously
>>> be mounting/unmounting stuff all the time. One of the applications to be
>>> deployed against the filesystem will be web service, I can't really
>>> envision a web server with tens of thousands of NFS mounts coming and
>>> going, seems like a lot of overhead.
>>
>> Well, that's why ZFS wouldn't work for us :-( .
>
> Although, I'm just saying that from my gut -- does anyone have any actual
> experience with automounting thousands of file systems? Does it work? Is it
> horribly inefficient? Poor performance? Resource intensive?
>
>> Makes sense -- in that case you would be looking at multiple SMB servers,
>> though.
>
> Yes, with again the resultant problem of worrying about where a user's
> files are when they want to access them :(.


--


- _/ _/  /   - Jonathan Loran -   -
-/  /   /IT Manager   -
-  _  /   _  / / Space Sciences Laboratory, UC Berkeley
-/  / /  (510) 643-5146 [EMAIL PROTECTED]
- __/__/__/   AST:7731^29u18e3






Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-22 Thread Peter Tribble
On 9/22/07, Paul B. Henson <[EMAIL PROTECTED]> wrote:
> On Fri, 21 Sep 2007, James F. Hranicky wrote:
>
> > > It just seems rather involved, and relatively inefficient to continuously
> > > be mounting/unmounting stuff all the time. One of the applications to be
> > > deployed against the filesystem will be web service, I can't really
> > > envision a web server with tens of thousands of NFS mounts coming and
> > > going, seems like a lot of overhead.
> >
> > Well, that's why ZFS wouldn't work for us :-( .
>
> Although, I'm just saying that from my gut -- does anyone have any actual
> experience with automounting thousands of file systems? Does it work? Is it
> horribly inefficient? Poor performance? Resource intensive?

Used to do this for years with 20,000 filesystems automounted - each user
home directory was automounted separately. Never caused any problems,
either with NIS+ or the automounter or the NFS clients and server. And much
of the time that was with hardware that would today be antique. So I wouldn't
expect any issues on the automounting part. [Except one - see later.]
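
(For reference, per-user automounting like that is usually nothing more than
a wildcard map entry -- the server and path here are made up:

  # auto_home
  *   fileserver:/export/home/&

so the map itself stays trivial no matter how many users there are.)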

That was with a relatively small number of ufs filesystems on the server
holding the data. When we first got hold of zfs I did try the exercise of
one zfs filesystem per user on the server, just to see how it would work.
While managing 20,000 filesystems with the automounter was trivial, the
attempt to manage 20,000 zfs filesystems wasn't entirely successful. In
fact, based on that experience I simply wouldn't go down the road of one
user per filesystem.

[There is one issue with automounting a large number of filesystems on
a Solaris 10 client. Every mount or unmount triggers SMF activity, and
can drive SMF up the wall. We saw one of the svc daemons hog a whole
cpu on our mailserver (constantly checking for .forward files in user home
directories). This has been fixed, I believe, but only very recently in S10.]

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Ed Plese
On Thu, Sep 20, 2007 at 12:49:29PM -0700, Paul B. Henson wrote:
> > > I was planning to provide CIFS services via Samba. I noticed a posting a
> > > while back from a Sun engineer working on integrating NFSv4/ZFS ACL 
> > > support
> > > into Samba, but I'm not sure if that was ever completed and shipped either
> > > in the Sun version or pending inclusion in the official version, does
> > > anyone happen to have an update on that? Also, I saw a patch proposing a
> > > different implementation of shadow copies that better supported ZFS
> > > snapshots, any thoughts on that would also be appreciated.
> >
> > This work is done and, AFAIK, has been integrated into S10 8/07.
> 
> Excellent. I did a little further research myself on the Samba mailing
> lists, and it looks like ZFS ACL support was merged into the official
> 3.0.26 release. Unfortunately, the patch to improve shadow copy performance
> on top of ZFS still appears to be floating around the technical mailing
> list under discussion.

ZFS ACL support was going to be merged into 3.0.26 but 3.0.26 ended up
being a security fix release and the merge got pushed back.  The next
release will be 3.2.0 and ACL support will be in there.

As others have pointed out though, Samba is included in Solaris 10
Update 4 along with support for ZFS ACLs, Active Directory, and SMF.

The patches for the shadow copy module can be found here:

http://www.edplese.com/samba-with-zfs.html

There are hopefully only a few minor changes that I need to make to them
before submitting them again to the Samba team.

I recently compiled the module for someone to use with Samba as shipped
with U4 and he reported that it worked well.  I've made the compiled
module available on this page as well if anyone is interested in testing
it.

The patch doesn't improve performance anymore in order to preserve
backwards compatibility with the existing module but adds usability
enhancements for both admins and end-users.  It allows shadow copy
functionality to "just work" with ZFS snapshots without having to create
symlinks to each snapshot in the root of each share.  For end-users it
allows the "Previous Versions" list to be sorted chronologically to make
it easier to use.  If performance is an issue the patch can be
modified to improve performance like the original patch did but this
only affects directory listings and is likely negligible in most cases.

> > > Is there any facility for managing ZFS remotely? We have a central 
> > > identity
> > > management system that automatically provisions resources as necessary for
> [...]
> > This is a loaded question.  There is a webconsole interface to ZFS which can
> > be run from most browsers.  But I think you'll find that the CLI is easier
> > for remote management.
> 
> Perhaps I should have been more clear -- a remote facility available via
> programmatic access, not manual user direct access. If I wanted to do
> something myself, I would absolutely login to the system and use the CLI.
> However, the question was regarding an automated process. For example, our
> Perl-based identity management system might create a user in the middle of
> the night based on the appearance in our authoritative database of that
> user's identity, and need to create a ZFS filesystem and quota for that
> user. So, I need to be able to manipulate ZFS remotely via a programmatic
> API.

While it won't help you in your case since your users access the files
using protocols other than CIFS, if you use only CIFS it's possible to
configure Samba to automatically create a user's home directory the
first time the user connects to the server.  This is done using the
"root preexec" share option in smb.conf and an example is provided at
the above URL.
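
The general shape of it is something like this (a sketch only -- the script
path and pool name are placeholders, not the exact example from that page):

  [homes]
      path = /tank/home/%U
      root preexec = /usr/local/sbin/mk-zfs-home %U

where mk-zfs-home would do little more than run
"zfs create -o quota=2G tank/home/$1" if the filesystem doesn't already exist.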


Ed Plese




Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Fri, 21 Sep 2007, Tim Spriggs wrote:

> Still, we are using ZFS but we are re-thinking on how to deploy/manage
> it. Our original model had us exporting/importing pools in order to move
> zone data between machines. We had done the same with UFS on iSCSI
[...]
> When we don't move pools around, zfs seems to be stable on both Solaris
> and OpenSolaris. I've done snapshots/rollbacks/sends/receives/clones/...

Sounds like your problems are in an area we probably wouldn't be delving
into... Thanks for the detail.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Fri, 21 Sep 2007, Mike Gerdts wrote:

> MS-DFS could be helpful here.  You could have a virtual samba instance
> that generates MS-DFS redirects to the appropriate spot.  At one point in

That's true, although I rather detest Microsoft DFS (they stole the acronym
from DCE/DFS, even though particularly the initial versions sucked
feature-wise in comparison). Also, the current release version of MacOS X
does not support CIFS DFS referrals. I'm not sure if the upcoming version
is going to rectify that or not. Windows clients not belonging to the
domain also occasionally have problems accessing shares across different
servers.

Although it is definitely something to consider if I'm going to be unable
to achieve my single namespace by having one large server...

Thanks...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Fri, 21 Sep 2007, Andy Lubel wrote:

> Yeah its fun to see IBM compete with its OEM provider Netapp.

Yes, we had both IBM and Netapp out as well. I'm not sure what the point
was... We do have some IBM SAN equipment on site, I suppose if we had gone
with the IBM variant we could have consolidated support.

> > sometimes it's more than just the raw storage... I wish I could just drop
> > in a couple of x4500's and not have to worry about the complexity of
> > clustering ...
> >
> zfs send/receive.

If I understand correctly, that would be sort of a poor man's replication?
So you would end up with a physical copy on server2 of all of the data on
server1? What would you do when server1 crashed and died? One of the
benefits of a real cluster would be the automatic failover, and failback
when the server recovered.
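
(As I understand it, that would mean something like a periodic job along
these lines -- pool, filesystem, and snapshot names are made up:

  zfs snapshot tank/home@2007-09-21
  zfs send -i tank/home@2007-09-20 tank/home@2007-09-21 | \
      ssh server2 zfs receive -F tank/home

which keeps server2 trailing server1 by one interval, with the cutover and
failback left entirely up to you.)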


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Fri, 21 Sep 2007, James F. Hranicky wrote:

> > It just seems rather involved, and relatively inefficient to continuously
> > be mounting/unmounting stuff all the time. One of the applications to be
> > deployed against the filesystem will be web service, I can't really
> > envision a web server with tens of thousands of NFS mounts coming and
> > going, seems like a lot of overhead.
>
> Well, that's why ZFS wouldn't work for us :-( .

Although, I'm just saying that from my gut -- does anyone have any actual
experience with automounting thousands of file systems? Does it work? Is it
horribly inefficient? Poor performance? Resource intensive?


> Makes sense -- in that case you would be looking at multiple SMB servers,
> though.

Yes, with again the resultant problem of worrying about where a user's
files are when they want to access them :(.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Thu, 20 Sep 2007, eric kustarz wrote:

> > As far as quotas, I was less than impressed with their implementation.
>
> Would you mind going into more details here?

The feature set was fairly extensive; they supported volume quotas for
users or groups, or "qtree" quotas, which, similar to the ZFS quota, would
limit space for a particular directory and all of its contents regardless
of user/group ownership.

But all quotas were set in a single flat text file. Anytime you added a new
quota, you needed to turn off quotas, then turn them back on, and quota
enforcement was disabled while it recalculated space utilization.

Like a lot of aspects of the filer, it seemed possibly functional but
rather kludgy. I hate kludgy :(. I'd have to go review the documentation to
recall the other issues I had with it, quotas were one of the last things
we reviewed and I'd about given up taking notes at that point.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Tim Spriggs
eric kustarz wrote:
>
> On Sep 21, 2007, at 3:50 PM, Tim Spriggs wrote:
>
>> m2# zpool create test mirror iscsi_lun1 iscsi_lun2
>> m2# zpool export test
>> m1# zpool import -f test
>> m1# reboot
>> m2# reboot
>
> Since I haven't actually looked into what problem caused your pools to
> become damaged/lost, I can only guess that it's possibly due to the
> pool being actively imported on multiple machines (perhaps even
> accidentally).
>
> If it is that, you'll be happy to note that we specifically no longer allow
> that to happen (unless you use the -f flag):
> http://blogs.sun.com/erickustarz/entry/poor_man_s_cluster_end
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6282725
>
> Looks like it just missed the s10u4 cut off, but should be in s10_u5.
>
> In your above example, there should be no reason why you have to use 
> the '-f' flag on import (the pool was cleanly exported) - when you're 
> moving the pool from system to system, this can get you into trouble 
> if things don't go exactly how you planned.
>
> eric

That's a very possible diagnosis. Even when the pools are exported from
one system, they are still marked as attached (thus the -f was
necessary). Since I rebooted both systems at the same time, I guess it's
possible that they both laid claim to the pool and corrupted it.

I'm glad this will be fixed in the future.

-Tim


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread eric kustarz

On Sep 21, 2007, at 3:50 PM, Tim Spriggs wrote:

> Paul B. Henson wrote:
>> On Thu, 20 Sep 2007, Tim Spriggs wrote:
>>
>>
>>> The x4500 is very sweet and the only thing stopping us from  
>>> buying two
>>> instead of another shelf is the fact that we have lost pools on  
>>> Sol10u3
>>> servers and there is no easy way of making two pools redundant  
>>> (ie the
>>> complexity of clustering.) Simply sending incremental snapshots  
>>> is not a
>>> viable option.
>>>
>>> The pools we lost were pools on iSCSI (in a mirrored config) and  
>>> they
>>> were mostly lost on zpool import/export. The lack of a recovery
>>> mechanism really limits how much faith we can put into our data  
>>> on ZFS.
>>> It's safe as long as the pool is safe... but we've lost multiple  
>>> pools.
>>>
>>
>> Lost data doesn't give me a warm fuzzy 8-/. Were you running an  
>> officially
>> supported version of Solaris at the time? If so, what did Sun  
>> support have
>> to say about this issue?
>>
>
> Sol 10 with just about all patches up to date.
>
> I joined this list in hope of a good answer. After answering a few
> questions over two days I had no hope of recovering the data. Don't
> import/export (especially between systems) without serious cause, at
> least not with U3. I haven't tried updating our servers yet and I  
> don't
> intend to for a while now. The filesystems contained databases that  
> were
> luckily redundant and could be rebuilt, but our DBA was not too  
> happy to
> have to do that at 3:00am.
>
> I still have a pool that can not be mounted or exported. It shows up
> with zpool list but nothing under zfs list. zpool export gives me  
> an IO
> error and does nothing. On the next downtime I am going to attempt to
> yank the lun out from under its feet (as gently as I can) after I have
> stopped all other services.
>
> Still, we are using ZFS but we are re-thinking on how to deploy/manage
> it. Our original model had us exporting/importing pools in order to  
> move
> zone data between machines. We had done the same with UFS on iSCSI
> without a hitch. ZFS worked for about 8 zone moves and then killed 2
> zones. The major operational difference between the moves involved a
> reboot of the global zones. The initial import worked but after the
> reboot the pools were in a bad state reporting errors on both  
> drives in
> the mirror. I exported one (bad choice) and attempted to gain  
> access to
> the other. Now attempting to import the first pool will panic a
> solaris/opensolaris box very reliably. The second is in the state I
> described above. Also, the drive labels are intact according to zdb.
>
> When we don't move pools around, zfs seems to be stable on both  
> Solaris
> and OpenSolaris. I've done snapshots/rollbacks/sends/receives/ 
> clones/...
> without problems. We even have zvols exported as mirrored luns from an
> OpenSolaris box. It mirrors the luns that the IBM/NetApp box  
> exports and
> seems to be doing fine with that. There are a lot of other people that
> seem to have the same opinion and use zfs with direct attached  
> storage.
>
> -Tim
>
> PS: "when I have a lot of time" I might try to reproduce this by:
>
> m2# zpool create test mirror iscsi_lun1 iscsi_lun2
> m2# zpool export test
> m1# zpool import -f test
> m1# reboot
> m2# reboot
>

Since I haven't actually looked into what problem caused your pools
to become damaged/lost, I can only guess that it's possibly due to the
pool being actively imported on multiple machines (perhaps even
accidentally).

If it is that, you'll be happy to note that we specifically no longer allow
that to happen (unless you use the -f flag):
http://blogs.sun.com/erickustarz/entry/poor_man_s_cluster_end
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6282725

Looks like it just missed the s10u4 cut off, but should be in s10_u5.

In your above example, there should be no reason why you have to use  
the '-f' flag on import (the pool was cleanly exported) - when you're  
moving the pool from system to system, this can get you into trouble  
if things don't go exactly how you planned.

eric


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Tim Spriggs
Paul B. Henson wrote:
> On Thu, 20 Sep 2007, Tim Spriggs wrote:
>
>   
>> The x4500 is very sweet and the only thing stopping us from buying two
>> instead of another shelf is the fact that we have lost pools on Sol10u3
>> servers and there is no easy way of making two pools redundant (ie the
>> complexity of clustering.) Simply sending incremental snapshots is not a
>> viable option.
>>
>> The pools we lost were pools on iSCSI (in a mirrored config) and they
>> were mostly lost on zpool import/export. The lack of a recovery
>> mechanism really limits how much faith we can put into our data on ZFS.
>> It's safe as long as the pool is safe... but we've lost multiple pools.
>> 
>
> Lost data doesn't give me a warm fuzzy 8-/. Were you running an officially
> supported version of Solaris at the time? If so, what did Sun support have
> to say about this issue?
>   

Sol 10 with just about all patches up to date.

I joined this list in hope of a good answer. After answering a few 
questions over two days I had no hope of recovering the data. Don't 
import/export (especially between systems) without serious cause, at 
least not with U3. I haven't tried updating our servers yet and I don't 
intend to for a while now. The filesystems contained databases that were 
luckily redundant and could be rebuilt, but our DBA was not too happy to 
have to do that at 3:00am.

I still have a pool that can not be mounted or exported. It shows up 
with zpool list but nothing under zfs list. zpool export gives me an IO 
error and does nothing. On the next downtime I am going to attempt to 
yank the lun out from under its feet (as gently as I can) after I have 
stopped all other services.

Still, we are using ZFS but we are re-thinking on how to deploy/manage 
it. Our original model had us exporting/importing pools in order to move 
zone data between machines. We had done the same with UFS on iSCSI 
without a hitch. ZFS worked for about 8 zone moves and then killed 2 
zones. The major operational difference between the moves involved a 
reboot of the global zones. The initial import worked but after the 
reboot the pools were in a bad state reporting errors on both drives in 
the mirror. I exported one (bad choice) and attempted to gain access to 
the other. Now attempting to import the first pool will panic a 
solaris/opensolaris box very reliably. The second is in the state I 
described above. Also, the drive labels are intact according to zdb.

When we don't move pools around, zfs seems to be stable on both Solaris 
and OpenSolaris. I've done snapshots/rollbacks/sends/receives/clones/... 
without problems. We even have zvols exported as mirrored luns from an 
OpenSolaris box. It mirrors the luns that the IBM/NetApp box exports and 
seems to be doing fine with that. There are a lot of other people that 
seem to have the same opinion and use zfs with direct attached storage.

-Tim

PS: "when I have a lot of time" I might try to reproduce this by:

m2# zpool create test mirror iscsi_lun1 iscsi_lun2
m2# zpool export test
m1# zpool import -f test
m1# reboot
m2# reboot


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Paul B. Henson
On Thu, 20 Sep 2007, Tim Spriggs wrote:

> The x4500 is very sweet and the only thing stopping us from buying two
> instead of another shelf is the fact that we have lost pools on Sol10u3
> servers and there is no easy way of making two pools redundant (ie the
> complexity of clustering.) Simply sending incremental snapshots is not a
> viable option.
>
> The pools we lost were pools on iSCSI (in a mirrored config) and they
> were mostly lost on zpool import/export. The lack of a recovery
> mechanism really limits how much faith we can put into our data on ZFS.
> It's safe as long as the pool is safe... but we've lost multiple pools.

Lost data doesn't give me a warm fuzzy 8-/. Were you running an officially
supported version of Solaris at the time? If so, what did Sun support have
to say about this issue?


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Mike Gerdts
On 9/20/07, Paul B. Henson <[EMAIL PROTECTED]> wrote:
> Again though, that would imply two different storage locations visible to
> the clients? I'd really rather avoid that. For example, with our current
> Samba implementation, a user can just connect to
> '\\files.csupomona.edu\' to access their home directory or
> '\\files.csupomona.edu\' to access a shared group directory.
> They don't need to worry on which physical server it resides or determine
> what server name to connect to.

MS-DFS could be helpful here.  You could have a virtual samba instance
that generates MS-DFS redirects to the appropriate spot.  At one point
in the past I wrote a script (long since lost - at a different job)
that would automatically convert automounter maps into the
appropriately formatted symbolic links used by the Samba MS-DFS
implementation.  It worked quite well for giving one place to
administer the location mapping while providing transparency to the
end-users.
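
For the archives, the basic shape of it is something like this (from memory,
untested; server/share names invented).  In smb.conf on the redirector:

   [global]
      host msdfs = yes

   [dfsroot]
      path = /export/dfsroot
      msdfs root = yes

and the script turned each automounter entry into a symlink in the msdfs root
of the form Samba expects, e.g.:

   ln -s 'msdfs:server01\homes01' /export/dfsroot/alice

so a client opening \\redirector\dfsroot\alice gets referred transparently to
\\server01\homes01.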

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread James F. Hranicky
Paul B. Henson wrote:
> On Thu, 20 Sep 2007, James F. Hranicky wrote:
> 
>> This can be solved using an automounter as well.
> 
> Well, I'd say more "kludged around" than "solved" ;), but again unless
> you've used DFS it might not seem that way.

Hey, I liked it :->

> It just seems rather involved, and relatively inefficient, to continuously
> be mounting/unmounting stuff all the time. One of the applications to be
> deployed against the filesystem will be web service; I can't really
> envision a web server with tens of thousands of NFS mounts coming and
> going, which seems like a lot of overhead.

Well, that's why ZFS wouldn't work for us :-( .

> I might need to pursue a similar route though if I can't get one large
> system to house everything in one place.
> 
>> Samba can be configured to map homes drives to /nfs/home/%u . Let samba use
>> the automounter setup and it's just as transparent on the CIFS side.
> 
> I'm planning to use NFSv4 with strong authentication and authorization
> through, and intended to run Samba directly on the file server itself
> accessing storage locally. I'm not sure that Samba would be able to acquire
> local Kerberos credentials and switch between them for the users, without
> that, access via NFSv4 isn't very doable.

Makes sense -- in that case you would be looking at multiple SMB
servers, though.

Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Andy Lubel



On 9/20/07 7:31 PM, "Paul B. Henson" <[EMAIL PROTECTED]> wrote:

> On Thu, 20 Sep 2007, Tim Spriggs wrote:
> 
>> It's an IBM re-branded NetApp which we are using for NFS and
>> iSCSI.

Yeah, it's fun to see IBM compete with its OEM provider NetApp.
> 
> Ah, I see.
> 
> Is it comparable storage though? Does it use SATA drives similar to the
> x4500, or more expensive/higher performance FC drives? Is it one of the
> models that allows connecting dual clustered heads and failing over the
> storage between them?
> 
> I agree the x4500 is a sweet looking box, but when making price comparisons
> sometimes it's more than just the raw storage... I wish I could just drop
> in a couple of x4500's and not have to worry about the complexity of
> clustering ...
> 
> 
zfs send/receive.


NetApp is great; we have about 6 varieties in production here. But for what I
pay in maintenance and up-front cost on just 2 filers, I can buy an x4500 a
year, and have a 3-year warranty each time I buy.  It just depends on the
company you work for.

I haven't played too much with anything but NetApp and StorageTek, but once
I got started on ZFS I just knew it was the future; and I think NetApp
realizes that too.  And if Apple does what I think it will, it will only get
better :)

Fast, Cheap, Easy - you only get 2.  Zfs may change that.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread eric kustarz

On Sep 20, 2007, at 6:46 PM, Paul B. Henson wrote:

> On Thu, 20 Sep 2007, Gary Mills wrote:
>
>> You should consider a Netapp filer.  It will do both NFS and CIFS,
>> supports disk quotas, and is highly reliable.  We use one for 30,000
>> students and 3000 employees.  Ours has never failed us.
>
> We had actually just finished evaluating Netapp before I started  
> looking
> into Solaris/ZFS. For a variety of reasons, it was not suitable to our
> requirements.
>



> As far as quotas, I was less than impressed with their implementation.

Would you mind going into more details here?

eric
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Tim Spriggs
Paul B. Henson wrote:
> Is it comparable storage though? Does it use SATA drives similar to the
> x4500, or more expensive/higher performance FC drives? Is it one of the
> models that allows connecting dual clustered heads and failing over the
> storage between them?
>
> I agree the x4500 is a sweet looking box, but when making price comparisons
> sometimes it's more than just the raw storage... I wish I could just drop
> in a couple of x4500's and not have to worry about the complexity of
> clustering ...
>   

It is configured with SATA drives and does support failover for NFS. 
iSCSI is another story at the moment.

The x4500 is very sweet and the only thing stopping us from buying two 
instead of another shelf is the fact that we have lost pools on Sol10u3 
servers and there is no easy way of making two pools redundant (ie the 
complexity of clustering.) Simply sending incremental snapshots is not a 
viable option.

The pools we lost were pools on iSCSI (in a mirrored config) and they 
were mostly lost on zpool import/export. The lack of a recovery 
mechanism really limits how much faith we can put into our data on ZFS. 
It's safe as long as the pool is safe... but we've lost multiple pools.

-Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Paul B. Henson
On Thu, 20 Sep 2007, Chris Kirby wrote:

> We're adding a style of quota that only includes the bytes referenced by
> the active fs.  Also, there will be a matching style for reservations.
>
> "some point in the future" is very soon (weeks).  :-)

I don't think my management will let me run Solaris Express on a production
server ;). How does that translate into availability in a
released/supported version? Would that be something released as a patch to
the just-released U4, or delayed until the next complete update
release?


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Paul B. Henson
On Thu, 20 Sep 2007, Tim Spriggs wrote:

> It's an IBM re-branded NetApp which we are using for NFS and
> iSCSI.

Ah, I see.

Is it comparable storage though? Does it use SATA drives similar to the
x4500, or more expensive/higher performance FC drives? Is it one of the
models that allows connecting dual clustered heads and failing over the
storage between them?

I agree the x4500 is a sweet looking box, but when making price comparisons
sometimes it's more than just the raw storage... I wish I could just drop
in a couple of x4500's and not have to worry about the complexity of
clustering ...



-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Chris Kirby
Paul B. Henson wrote:
> On Thu, 20 Sep 2007, James F. Hranicky wrote:
> 
> 
>>and due to the fact that snapshots counted toward ZFS quota, I decided
> 
> 
> Yes, that does seem to remove a bit of their value for backup purposes. I
> think they're planning to rectify that at some point in the future.

We're adding a style of quota that only includes the bytes
referenced by the active fs.  Also, there will be a matching
style for reservations.

"some point in the future" is very soon (weeks).  :-)

-Chris
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Tim Spriggs
Paul B. Henson wrote:
> On Thu, 20 Sep 2007, Tim Spriggs wrote:
>
>   
>> We are in a similar situation. It turns out that buying two thumpers is
>> cheaper per TB than buying more shelves for an IBM N7600. I don't know
>> about power/cooling considerations yet though.
>> 
>
> It's really a completely different class of storage though, right? I don't
> know offhand what an IBM N7600 is, but presumably some type of SAN device?
> Which can be connected simultaneously to multiple servers for clustering?
>
> An x4500 looks great if you only want a bunch of storage with the
> reliability/availability provided by a relatively fault-tolerant server.
> But if you want to be able to withstand server failure, or continue to
> provide service while having one server down for maintenance/patching, it
> doesn't seem appropriate.
>
>
>   

It's an IBM re-branded NetApp which we are using for NFS and 
iSCSI.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Paul B. Henson
On Thu, 20 Sep 2007, Gary Mills wrote:

> You should consider a Netapp filer.  It will do both NFS and CIFS,
> supports disk quotas, and is highly reliable.  We use one for 30,000
> students and 3000 employees.  Ours has never failed us.

We had actually just finished evaluating Netapp before I started looking
into Solaris/ZFS. For a variety of reasons, it was not suitable to our
requirements.

One, for example, was that it did not support simultaneous operation in an
MIT Kerberos realm for NFS authentication while at the same time belonging
to an active directory domain for CIFS authentication. Their workaround was
to have the filer behave like an NT4 server rather than a Windows 2000+
server, which seemed pretty stupid. That also resulted in the filer
not supporting NTLMv2, which was unacceptable.

Another issue we had was with access control. Their approach to ACLs was
just flat out ridiculous. You had UNIX mode bits, NFSv4 ACLs, and CIFS
ACLs, all disjoint, and which one was actually being used and how they
interacted was extremely confusing and not even accurately documented. We
wanted to be able to have the exact same permissions applied whether via
NFSv4 or CIFS, and ideally allow changing permissions via either access
protocol. That simply wasn't going to happen with Netapp.

Their Kerberos implementation only supported DES, not 3DES or AES, their
LDAP integration only supported the legacy posixGroup/memberUid attribute
as opposed to the more modern groupOfNames/member attribute for group
membership.
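
For anyone unfamiliar with the distinction, the two group styles look roughly
like this in LDIF (entries invented):

   # legacy posixGroup: membership by login name
   dn: cn=staff,ou=group,dc=example,dc=edu
   objectClass: posixGroup
   cn: staff
   gidNumber: 1001
   memberUid: jdoe

   # groupOfNames: membership by full DN
   dn: cn=staff,ou=groups,dc=example,dc=edu
   objectClass: groupOfNames
   cn: staff
   member: uid=jdoe,ou=people,dc=example,dc=edu

The filer only understood the former.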

They have some type of remote management API, but it just wasn't very clean
IMHO.

As far as quotas, I was less than impressed with their implementation.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Paul B. Henson
On Thu, 20 Sep 2007, Tim Spriggs wrote:

> We are in a similar situation. It turns out that buying two thumpers is
> cheaper per TB than buying more shelves for an IBM N7600. I don't know
> about power/cooling considerations yet though.

It's really a completely different class of storage though, right? I don't
know offhand what an IBM N7600 is, but presumably some type of SAN device?
Which can be connected simultaneously to multiple servers for clustering?

An x4500 looks great if you only want a bunch of storage with the
reliability/availability provided by a relatively fault-tolerant server.
But if you want to be able to withstand server failure, or continue to
provide service while having one server down for maintenance/patching, it
doesn't seem appropriate.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Paul B. Henson
On Thu, 20 Sep 2007, Andy Lubel wrote:

> Looks like it's completely scalable, but your boot time may suffer the more
> you have. Just don't reboot :)

I'm not sure if it's accurate, but the SE we were meeting with claimed that
we could fail over all of the filesystems to one half of the cluster, reboot
the other half, fail them back, reboot the first half, and have rebooted
both cluster members with no downtime. I guess that holds as long as the
active cluster member does not fail during the potentially lengthy downtime
of the one rebooting.
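
If that's accurate, I assume the rolling patch dance would be driven by
something like this (Sun Cluster 3.x syntax from memory; resource group name
invented):

   node2# scswitch -z -g nfs-rg -h node2    # pull the HA-NFS resource group onto node2
   node1# reboot                            # patch/reboot node1 while node2 serves
   node1# scswitch -z -g nfs-rg -h node1    # fail it back, then repeat for node2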

> If it was so great why did IBM kill it?

I often daydreamed of a group of high-level IBM executives tied to chairs
next to a table filled with rubber hoses ;), for the sole purpose of
getting that answer.

I think they killed it because the market of technically knowledgeable and
capable people that were able to use it to its full capacity was relatively
limited, and the average IT shop was happy with Windoze :(.

> Did they have an alternative with the same functionality?

No, not really. Depending on your situation, they recommended
transitioning to GPFS or NFSv4, but neither really met the same needs as
DFS.


> I really have to disagree, we have 6120 and 6130's and if I had the option
> to actually plan out some storage I would have just bought a thumper.  You
> could probably buy 2 for the cost of that 6140.

Thumper = x4500, right? You can't really cluster the internal storage of an
x4500, so assuming high reliability/availability was a requirement that
sort of rules that box out.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Paul B. Henson
On Thu, 20 Sep 2007, James F. Hranicky wrote:

> This can be solved using an automounter as well.

Well, I'd say more "kludged around" than "solved" ;), but again unless
you've used DFS it might not seem that way.

It just seems rather involved, and relatively inefficient, to continuously
be mounting/unmounting stuff all the time. One of the applications to be
deployed against the filesystem will be web service; I can't really
envision a web server with tens of thousands of NFS mounts coming and
going, which seems like a lot of overhead.

I might need to pursue a similar route though if I can't get one large
system to house everything in one place.

> Samba can be configured to map homes drives to /nfs/home/%u . Let samba use
> the automounter setup and it's just as transparent on the CIFS side.

I'm planning to use NFSv4 with strong authentication and authorization
through, and intended to run Samba directly on the file server itself
accessing storage locally. I'm not sure that Samba would be able to acquire
local Kerberos credentials and switch between them for the users, without
that, access via NFSv4 isn't very doable.

> and due to the fact that snapshots counted toward ZFS quota, I decided

Yes, that does seem to remove a bit of their value for backup purposes. I
think they're planning to rectify that at some point in the future.


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Dickon Hood
On Thu, Sep 20, 2007 at 16:22:45 -0500, Gary Mills wrote:

: You should consider a Netapp filer.  It will do both NFS and CIFS,
: supports disk quotas, and is highly reliable.  We use one for 30,000
: students and 3000 employees.  Ours has never failed us.

And they might only lightly sue you for contemplating zfs if you're
really, really lucky...

-- 
Dickon Hood

Due to digital rights management, my .sig is temporarily unavailable.
Normal service will be resumed as soon as possible.  We apologise for the
inconvenience in the meantime.

No virus was found in this outgoing message as I didn't bother looking.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Gary Mills
On Thu, Sep 20, 2007 at 12:49:29PM -0700, Paul B. Henson wrote:
> On Thu, 20 Sep 2007, Richard Elling wrote:
> 
> > 50,000 directories aren't a problem, unless you also need 50,000 quotas
> > and hence 50,000 file systems.  Such a large, single storage pool system
> > will be an outlier... significantly beyond what we have real world
> > experience with.
> 
> Hmm, I get a bit of a shiver down my spine at the prospect of deploying a
> critical central service in a relatively untested configuration 8-/. What
> is the maximum number of file systems in a given pool that has undergone
> some reasonable amount of real world deployment?

You should consider a Netapp filer.  It will do both NFS and CIFS,
supports disk quotas, and is highly reliable.  We use one for 30,000
students and 3000 employees.  Ours has never failed us.

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Tim Spriggs
Andy Lubel wrote:
> On 9/20/07 3:49 PM, "Paul B. Henson" <[EMAIL PROTECTED]> wrote:
>
>   
>> On Thu, 20 Sep 2007, Richard Elling wrote:
>>
>> 
>> That would also be my preference, but if I were forced to use hardware
>> RAID, the additional loss of storage for ZFS redundancy would be painful.
>>
>> Would anyone happen to have any good recommendations for an enterprise
>> scale storage subsystem suitable for ZFS deployment? If I recall correctly,
>> the SE we spoke with recommended the StorageTek 6140 in a hardware raid
>> configuration, and evidently mistakenly claimed that Cluster would not work
>> with JBOD.
>> 
>
> I really have to disagree, we have 6120 and 6130's and if I had the option
> to actually plan out some storage I would have just bought a thumper.  You
> could probably buy 2 for the cost of that 6140.
>   

We are in a similar situation. It turns out that buying two thumpers is 
cheaper per TB than buying more shelves for an IBM N7600. I don't know 
about power/cooling considerations yet though.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Andy Lubel
On 9/20/07 3:49 PM, "Paul B. Henson" <[EMAIL PROTECTED]> wrote:

> On Thu, 20 Sep 2007, Richard Elling wrote:
> 
>> 50,000 directories aren't a problem, unless you also need 50,000 quotas
>> and hence 50,000 file systems.  Such a large, single storage pool system
>> will be an outlier... significantly beyond what we have real world
>> experience with.
> 
> Yes, considering that 45,000 of those users will be students, we definitely
> need separate quotas for each one :).
> 
> Hmm, I get a bit of a shiver down my spine at the prospect of deploying a
> critical central service in a relatively untested configuration 8-/. What
> is the maximum number of file systems in a given pool that has undergone
> some reasonable amount of real world deployment?

15,500 is the most I see in this article:

http://developers.sun.com/solaris/articles/nfs_zfs.html

Looks like it's completely scalable, but your boot time may suffer the more
you have. Just don't reboot :)

> 
> One issue I have is that our previous filesystem, DFS, completely spoiled
> me with its global namespace and location transparency. We had three fairly
> large servers, with the content evenly dispersed among them, but from the
> perspective of the client any user's files were available at
> /dfs/user/, regardless of which physical server they resided on.
> We could even move them around between servers transparently.

If it was so great why did IBM kill it?  Did they have an alternative with
the same functionality?

> 
> Unfortunately, there aren't really any filesystems available with similar
> features and enterprise applicability. OpenAFS comes closest, we've been
> prototyping that but the lack of per file ACLs bites, and as an add-on
> product we've had issues with kernel compatibility across upgrades.
> 
> I was hoping to replicate a similar feel by just having one large file
> server with all the data on it. If I split our user files across multiple
> servers, we would have to worry about which server contained what files,
> which would be rather annoying.
> 
> There are some features in NFSv4 that seem like they might someday help
> resolve this problem, but I don't think they are readily available in
> servers and definitely not in the common client.
> 
>>> I was planning to provide CIFS services via Samba. I noticed a posting a
>>> while back from a Sun engineer working on integrating NFSv4/ZFS ACL support
>>> into Samba, but I'm not sure if that was ever completed and shipped either
>>> in the Sun version or pending inclusion in the official version, does
>>> anyone happen to have an update on that? Also, I saw a patch proposing a
>>> different implementation of shadow copies that better supported ZFS
>>> snapshots, any thoughts on that would also be appreciated.
>> 
>> This work is done and, AFAIK, has been integrated into S10 8/07.
> 
> Excellent. I did a little further research myself on the Samba mailing
> lists, and it looks like ZFS ACL support was merged into the official
> 3.0.26 release. Unfortunately, the patch to improve shadow copy performance
> on top of ZFS still appears to be floating around the technical mailing
> list under discussion.
> 
>>> Is there any facility for managing ZFS remotely? We have a central identity
>>> management system that automatically provisions resources as necessary for
> [...]
>> This is a loaded question.  There is a webconsole interface to ZFS which can
>> be run from most browsers.  But I think you'll find that the CLI is easier
>> for remote management.
> 
> Perhaps I should have been more clear -- a remote facility available via
> programmatic access, not manual user direct access. If I wanted to do
> something myself, I would absolutely login to the system and use the CLI.
> However, the question was regarding an automated process. For example, our
> Perl-based identity management system might create a user in the middle of
> the night based on the appearance in our authoritative database of that
> user's identity, and need to create a ZFS filesystem and quota for that
> user. So, I need to be able to manipulate ZFS remotely via a programmatic
> API.
>
>> Active/passive only.  ZFS is not supported over pxfs and ZFS cannot be
>> mounted simultaneously from two different nodes.
> 
> That's what I thought, I'll have to get back to that SE. Makes me wonder as
> to the reliability of his other answers :).
> 
>> For most large file servers, people will split the file systems across
>> servers such that under normal circumstances, both nodes are providing
>> file service.  This implies two or more storage pools.
> 
> Again though, that would imply two different storage locations visible to
> the clients? I'd really rather avoid that. For example, with our current
> Samba implementation, a user can just connect to
> '\\files.csupomona.edu\' to access their home directory or
> '\\files.csupomona.edu\' to access a shared group directory.
> They don't need to worry about which physical server it resides or determine
> what server name to connect to.

Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread James F. Hranicky
Paul B. Henson wrote:

> One issue I have is that our previous filesystem, DFS, completely spoiled
> me with its global namespace and location transparency. We had three fairly
> large servers, with the content evenly dispersed among them, but from the
> perspective of the client any user's files were available at
> /dfs/user/, regardless of which physical server they resided on.
> We could even move them around between servers transparently.

This can be solved using an automounter as well. All home directories
are specified as

/nfs/home/user

in the passwd map, then have a homes map that maps

/nfs/home/user -> /nfs/homeXX/user

then have a map that maps

/nfs/homeXX -> serverXX:/export/homeXX

You can have any number of servers serving up any number of homes
filesystems. Moving users between servers means only changing the
mapping in the homes map. The user never knows the difference, only
seeing the homedir as

/nfs/home/user

(we used amd)
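
For anyone wanting the same thing with the stock Solaris automounter instead
of amd, a rough equivalent (map entries and server names invented, and the two
map levels above collapsed into one) would be:

   # /etc/auto_master
   /nfs/home    auto_home

   # auto_home: one entry per user, pointing at whichever server holds them
   alice    server01:/export/home01/alice
   bob      server02:/export/home02/bob

Moving a user is then just an edit to the auto_home map (plus copying the data).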

> Again though, that would imply two different storage locations visible to
> the clients? I'd really rather avoid that. For example, with our current
> Samba implementation, a user can just connect to
> '\\files.csupomona.edu\' to access their home directory or
> '\\files.csupomona.edu\' to access a shared group directory.
> They don't need to worry about which physical server it resides or determine
> what server name to connect to.

Samba can be configured to map homes drives to /nfs/home/%u . Let samba use
the automounter setup and it's just as transparent on the CIFS side.
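
(Something along these lines in smb.conf, untested here:

   [homes]
      path = /nfs/home/%u
      browseable = no
      read only = no

so the share path just rides on top of whatever the automounter resolves.)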

This is how we had things set up at my previous place of employment and
it worked extremely well. Unfortunately, due to lack of BSD-style quotas
and due to the fact that snapshots counted toward ZFS quota, I decided
against using ZFS for filesystem service -- the automounter setup cannot
mitigate the bunches-of-little-filesystems problem.

Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Paul B. Henson
On Thu, 20 Sep 2007, Richard Elling wrote:

> 50,000 directories aren't a problem, unless you also need 50,000 quotas
> and hence 50,000 file systems.  Such a large, single storage pool system
> will be an outlier... significantly beyond what we have real world
> experience with.

Yes, considering that 45,000 of those users will be students, we definitely
need separate quotas for each one :).

Hmm, I get a bit of a shiver down my spine at the prospect of deploying a
critical central service in a relatively untested configuration 8-/. What
is the maximum number of file systems in a given pool that has undergone
some reasonable amount of real world deployment?

One issue I have is that our previous filesystem, DFS, completely spoiled
me with its global namespace and location transparency. We had three fairly
large servers, with the content evenly dispersed among them, but from the
perspective of the client any user's files were available at
/dfs/user/, regardless of which physical server they resided on.
We could even move them around between servers transparently.

Unfortunately, there aren't really any filesystems available with similar
features and enterprise applicability. OpenAFS comes closest, we've been
prototyping that but the lack of per file ACLs bites, and as an add-on
product we've had issues with kernel compatibility across upgrades.

I was hoping to replicate a similar feel by just having one large file
server with all the data on it. If I split our user files across multiple
servers, we would have to worry about which server contained what files,
which would be rather annoying.

There are some features in NFSv4 that seem like they might someday help
resolve this problem, but I don't think they are readily available in
servers and definitely not in the common client.

> > I was planning to provide CIFS services via Samba. I noticed a posting a
> > while back from a Sun engineer working on integrating NFSv4/ZFS ACL support
> > into Samba, but I'm not sure if that was ever completed and shipped either
> > in the Sun version or pending inclusion in the official version, does
> > anyone happen to have an update on that? Also, I saw a patch proposing a
> > different implementation of shadow copies that better supported ZFS
> > snapshots, any thoughts on that would also be appreciated.
>
> This work is done and, AFAIK, has been integrated into S10 8/07.

Excellent. I did a little further research myself on the Samba mailing
lists, and it looks like ZFS ACL support was merged into the official
3.0.26 release. Unfortunately, the patch to improve shadow copy performance
on top of ZFS still appears to be floating around the technical mailing
list under discussion.

> > Is there any facility for managing ZFS remotely? We have a central identity
> > management system that automatically provisions resources as necessary for
[...]
> This is a loaded question.  There is a webconsole interface to ZFS which can
> be run from most browsers.  But I think you'll find that the CLI is easier
> for remote management.

Perhaps I should have been more clear -- a remote facility available via
programmatic access, not manual user direct access. If I wanted to do
something myself, I would absolutely login to the system and use the CLI.
However, the question was regarding an automated process. For example, our
Perl-based identity management system might create a user in the middle of
the night based on the appearance in our authoritative database of that
user's identity, and need to create a ZFS filesystem and quota for that
user. So, I need to be able to manipulate ZFS remotely via a programmatic
API.
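
Worst case, I suppose the provisioning code could drive the CLI over ssh and
parse the output, something like (host, pool, and user names invented):

   ssh root@fileserver zfs create -o quota=2g tank/home/jdoe     # nightly provisioning
   ssh root@fileserver zfs set quota=5g tank/home/jdoe           # helpdesk quota change
   ssh root@fileserver zfs get -H -o value quota tank/home/jdoe  # read it back for a web UI

but a real API, or at least a stable machine-parseable output mode, would be
much nicer.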

> Active/passive only.  ZFS is not supported over pxfs and ZFS cannot be
> mounted simultaneously from two different nodes.

That's what I thought, I'll have to get back to that SE. Makes me wonder as
to the reliability of his other answers :).

> For most large file servers, people will split the file systems across
> servers such that under normal circumstances, both nodes are providing
> file service.  This implies two or more storage pools.

Again though, that would imply two different storage locations visible to
the clients? I'd really rather avoid that. For example, with our current
Samba implementation, a user can just connect to
'\\files.csupomona.edu\' to access their home directory or
'\\files.csupomona.edu\' to access a shared group directory.
They don't need to worry about which physical server it resides or determine
what server name to connect to.

> The SE is mistaken.  Sun^H^Holaris Cluster supports a wide variety of
> JBOD and RAID array solutions.  For ZFS, I recommend a configuration
> which allows ZFS to repair corrupted data.

That would also be my preference, but if I were forced to use hardware
RAID, the additional loss of storage for ZFS redundancy would be painful.

Would anyone happen to have any good recommendations for an enterprise
scale storage subsystem suitable for ZFS deployment? If I recall correctly,
the SE we spoke with recommended the StorageTek 6140 in a hardware raid
configuration, and evidently mistakenly claimed that Cluster would not work
with JBOD.

Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Richard Elling
a few comments below...

Paul B. Henson wrote:
> We are looking for a replacement enterprise file system to handle storage
> needs for our campus. For the past 10 years, we have been happily using DFS
> (the distributed file system component of DCE), but unfortunately IBM
> killed off that product and we have been running without support for over a
> year now. We have looked at a variety of possible options, none of which
> have proven fruitful. We are currently investigating the possibility of a
> Solaris 10/ZFS implementation. I have done a fair amount of reading and
> perusal of the mailing list archives, but I apologize in advance if I ask
> anything I should have already found in a FAQ or other repository.
> 
> Basically, we are looking to provide initially 5 TB of usable storage,
> potentially scaling up to 25-30TB of usable storage after successful
> initial deployment. We would have approximately 50,000 user home
> directories and perhaps 1000 shared group storage directories. Access to
> this storage would be via NFSv4 for our UNIX infrastructure, and CIFS for
> those annoying Windows systems you just can't seem to get rid of ;).

50,000 directories aren't a problem, unless you also need 50,000 quotas and
hence 50,000 file systems.  Such a large, single storage pool system will
be an outlier... significantly beyond what we have real world experience
with.

> I read that initial versions of ZFS had scalability issues with such a
> large number of file systems, resulting in extremely long boot times and
> other problems. Supposedly a lot of those problems have been fixed in the
> latest versions of OpenSolaris, and many of the fixes have been backported
> to the official Solaris 10 update 4? Will that version of Solaris
> reasonably support 50 odd thousand ZFS file systems?

There have been improvements in performance and usability.  Not all
performance problems were in ZFS, but large numbers of file systems exposed
other problems.  However, I don't think that this has been characterized.

> I saw a couple of threads in the mailing list archives regarding NFS not
> transitioning file system boundaries, requiring each and every ZFS
> filesystem (50 thousand-ish in my case) to be exported and mounted on the
> client separately. While that might be feasible with an automounter, it
> doesn't really seem desirable or efficient. It would be much nicer to
> simply have one mount point on the client with all the home directories
> available underneath it. I was wondering whether or not that would be
> possible with the NFSv4 pseudo-root feature. I saw one posting that
> indicated it might be, but it wasn't clear whether or not that was a
> current feature or something yet to be implemented. I have no requirements
> to support legacy NFSv2/3 systems, so a solution only available via NFSv4
> would be acceptable.
> 
> I was planning to provide CIFS services via Samba. I noticed a posting a
> while back from a Sun engineer working on integrating NFSv4/ZFS ACL support
> into Samba, but I'm not sure if that was ever completed and shipped either
> in the Sun version or pending inclusion in the official version, does
> anyone happen to have an update on that? Also, I saw a patch proposing a
> different implementation of shadow copies that better supported ZFS
> snapshots, any thoughts on that would also be appreciated.

This work is done and, AFAIK, has been integrated into S10 8/07.

> Is there any facility for managing ZFS remotely? We have a central identity
> management system that automatically provisions resources as necessary for
> users, as well as providing an interface for helpdesk staff to modify
> things such as quota. I'd be willing to implement some type of web service
> on the actual server if there is no native remote management; in that case,
> is there any way to directly configure ZFS via a programmatic API, as
> opposed to running binaries and parsing the output? Some type of perl
> module would be perfect.

This is a loaded question.  There is a webconsole interface to ZFS which can
be run from most browsers.  But I think you'll find that the CLI is easier
for remote management.

> We need high availability, so are looking at Sun Cluster. That seems to add
> an extra layer of complexity, but there's no way I'll get signoff on
> a solution without redundancy. It would appear that ZFS failover is
> supported with the latest version of Solaris/Sun Cluster? I was speaking
> with a Sun SE who claimed that ZFS would actually operate active/active in
> a cluster, simultaneously writable by both nodes. From what I had read, ZFS
> is not a cluster file system, and would only operate in the active/passive
> failover capacity. Any comments?

Active/passive only.  ZFS is not supported over pxfs and ZFS cannot be
mounted simultaneously from two different nodes.

For most large file servers, people will split the file systems across
servers such that under normal circumstances, both nodes are providing
file service.  This implies two or more storage pools.

[zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-19 Thread Paul B. Henson

We are looking for a replacement enterprise file system to handle storage
needs for our campus. For the past 10 years, we have been happily using DFS
(the distributed file system component of DCE), but unfortunately IBM
killed off that product and we have been running without support for over a
year now. We have looked at a variety of possible options, none of which
have proven fruitful. We are currently investigating the possibility of a
Solaris 10/ZFS implementation. I have done a fair amount of reading and
perusal of the mailing list archives, but I apologize in advance if I ask
anything I should have already found in a FAQ or other repository.

Basically, we are looking to provide initially 5 TB of usable storage,
potentially scaling up to 25-30TB of usable storage after successful
initial deployment. We would have approximately 50,000 user home
directories and perhaps 1000 shared group storage directories. Access to
this storage would be via NFSv4 for our UNIX infrastructure, and CIFS for
those annoying Windows systems you just can't seem to get rid of ;).

I read that initial versions of ZFS had scalability issues with such a
large number of file systems, resulting in extremely long boot times and
other problems. Supposedly a lot of those problems have been fixed in the
latest versions of OpenSolaris, and many of the fixes have been backported
to the official Solaris 10 update 4? Will that version of Solaris
reasonably support 50 odd thousand ZFS file systems?

I saw a couple of threads in the mailing list archives regarding NFS not
transitioning file system boundaries, requiring each and every ZFS
filesystem (50 thousand-ish in my case) to be exported and mounted on the
client separately. While that might be feasible with an automounter, it
doesn't really seem desirable or efficient. It would be much nicer to
simply have one mount point on the client with all the home directories
available underneath it. I was wondering whether or not that would be
possible with the NFSv4 pseudo-root feature. I saw one posting that
indicated it might be, but it wasn't clear whether or not that was a
current feature or something yet to be implemented. I have no requirements
to support legacy NFSv2/3 systems, so a solution only available via NFSv4
would be acceptable.
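
In other words, I'm hoping that something like the following on the server
(pool name and paths invented), plus a single NFSv4 mount on the client, would
be enough:

   # sharenfs is inherited, so every user filesystem under home gets exported
   zfs create -o mountpoint=/export/home -o sharenfs=rw tank/home
   zfs create tank/home/jdoe

   client# mount -F nfs -o vers=4 fileserver:/export/home /home

Whether the client can then cross from /home into /home/jdoe as a child
filesystem is exactly the open question above.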

I was planning to provide CIFS services via Samba. I noticed a posting a
while back from a Sun engineer working on integrating NFSv4/ZFS ACL support
into Samba, but I'm not sure if that was ever completed and shipped either
in the Sun version or pending inclusion in the official version, does
anyone happen to have an update on that? Also, I saw a patch proposing a
different implementation of shadow copies that better supported ZFS
snapshots, any thoughts on that would also be appreciated.

Is there any facility for managing ZFS remotely? We have a central identity
management system that automatically provisions resources as necessary for
users, as well as providing an interface for helpdesk staff to modify
things such as quota. I'd be willing to implement some type of web service
on the actual server if there is no native remote management; in that case,
is there any way to directly configure ZFS via a programmatic API, as
opposed to running binaries and parsing the output? Some type of perl
module would be perfect.

We need high availability, so are looking at Sun Cluster. That seems to add
an extra layer of complexity, but there's no way I'll get signoff on
a solution without redundancy. It would appear that ZFS failover is
supported with the latest version of Solaris/Sun Cluster? I was speaking
with a Sun SE who claimed that ZFS would actually operate active/active in
a cluster, simultaneously writable by both nodes. From what I had read, ZFS
is not a cluster file system, and would only operate in the active/passive
failover capacity. Any comments?

The SE also told me that Sun Cluster requires hardware raid, which
conflicts with the general recommendation to feed ZFS raw disk. It seems
such a configuration would either require configuring zdevs directly on the
raid LUNs, losing ZFS self-healing and checksum correction features, or
losing space to not only the hardware raid level, but a partially redundant
ZFS level as well. What is the general consensus on the best way to deploy
ZFS under a cluster using hardware raid?

Any other thoughts/comments on the feasibility or practicality of a
large-scale ZFS deployment like this?

Thanks much...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss