Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-11 Thread David Magda

On Sep 10, 2007, at 13:40, [EMAIL PROTECTED] wrote:

 I am not against refactoring solutions, but zfs quotas and the lack of
 user quotas in general either leave people trying to use zfs quotas in lieu
 of user quotas, suggesting weak end runs against the problem (a cron to
 calculate hogs), or belittling the need to actually limit disk usage per
 user id.

And let's not forget group ID

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Brian H. Nelson
Stephen Usher wrote:
 Brian H. Nelson:

 I'm sure it would be interesting for those on the list if you could 
 outline the gotchas so that the rest of us don't have to re-invent the 
 wheel... or at least not fall down the pitfalls.
   

I believe I ran into one or both of these bugs:

6429996 zvols don't reserve enough space for requisite meta data
6430003 record size needs to affect zvol reservation size on RAID-Z

Basically what happened was that the zpool filled to 100% and broke UFS 
with 'no space left on device' errors. This was quite strange to sort 
out since the UFS zvol had 30GB of free space.

I never got any replies to my request for more info and/or workarounds 
for the above bugs. My workaround and recommendation is to leave a 
'healthy' amount of un-allocated space in the zpool. I don't know what a 
good level for 'healthy' is. Currently I've left about 1% (2GB) on a 
200GB raid-z pool.
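
If it helps anyone, one crude way to enforce that headroom (a sketch;
the dataset name is made up) is to park a reservation on an otherwise
empty dataset so the rest of the pool can never consume it:

  zfs create pool0/headroom
  zfs set reservation=2G pool0/headroom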

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Brian H. Nelson
Mike Gerdts wrote:
 The UFS on zvols option sounds intriguing to me, but I would guess
 that the following could be problems:

 1) Double buffering:  Will ZFS store data in the ARC while UFS uses
 traditional file system buffers?
   
This is probably an issue. You also have the journal+COW combination 
issue. I'm guessing that both would be performance concerns. My 
application is relatively low bandwidth, so I haven't dug deep into this 
area.
 2) Boot order dependencies.  How does the startup of zfs compare to
 processing of /etc/vfstab?  I would guess that this is OK due to
 legacy mount type supported by zfs.  If this is OK, then dfstab
 processing is probably OK.
Zvols by nature are not available under ZFS automatic mounting. You 
would need to add the /dev/zvol/dsk/... lines to /etc/vfstab just as you 
would for any other /dev/dsk... or /dev/md/dsk/... devices.
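
For example, a UFS-on-zvol entry in /etc/vfstab might look something
like this (pool, volume and mount point names are only illustrative):

  /dev/zvol/dsk/pool0/homevol  /dev/zvol/rdsk/pool0/homevol  /export/home2  ufs  2  yes  logging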

If you are not using the zpool for anything else, I would remove the 
automatic mount point for it.

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Brian H. Nelson
Stephen Usher wrote:

 Brian H. Nelson:

 I'm sure it would be interesting for those on the list if you could 
 outline the gotchas so that the rest of us don't have to re-invent the 
 wheel... or at least not fall down the pitfalls.
   
Also, here's a link to the ufs on zvol blog where I originally found the 
idea:

http://blogs.sun.com/scottdickson/entry/fun_with_zvols_-_ufs

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Richard Elling
Mike Gerdts wrote:
 On 9/8/07, Richard Elling [EMAIL PROTECTED] wrote:
 Changing the topic slightly, the strategic question is:
 why are you providing disk space to students?
 
 For most programming and productivity (e.g. word processing, etc.)
 people will likely be better suited by having network access for their
 personal equipment with local storage.

Most students today are carrying around more storage in their pocket
than they'll get from the university.

 For cases when specialized expensive tools ($10k + per seat) are used,
 it is not practical to install them on hundreds or thousands of
 personal devices for a semester or two of work.  The typical computing
 lab that provides such tools is not well equipped to deal with
 removable media such as flash drives.

I disagree; any lab machine bought in the past 5 years or so has a USB
port, even SunRays.

 Further, such tools will often times be used to do designs that require
 simulations to run as batch jobs that run under grid computing tools
 such as Grid Engine, Condor, LSF, etc.

Yes, but you won't have 15,000 students running grid engine.  But even
if you do, you can adopt the services models now prevalent in the
industry.  For example, rather than providing storage for a class, let
Google or Yahoo do it.

 Then, of course, there are files that need to be shared, have reliable
 backups, etc.  Pushing that out to desktop or laptop machines is not
 really a good idea.

Clearly the business of a university has different requirements than
student instruction.  But even then, it seems we're stuck in the 1960s
rather than the 21st century.

I think I might have some home directory somewhere at USC, where I
currently attend, but I'm not really sure.  I know I have a (Sun-based :-)
email account with some sort of quota, but that isn't implemented as a
file system quota.  I keep my stuff in my pocket.  This won't work entirely
for situations like Steve's compute cluster, but it will for many.

There is also a long tail situation here, which is how I approached the
problem at eng.Auburn.edu.  1% of the users will use > 90% of the space. For
them, I had special places.  For everyone else, they were lumped into large-ish
buckets.  A daily cron job easily identifies the 1% and we could proactively
redistribute them, as needed.  Of course, quotas are also easily defeated
and the more clever students played a fun game of hide-and-seek, but I
digress.  There is more than one way to solve these allocation problems.
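
(For the curious, such a job can be as simple as something on the order of

  du -sk /export/home/* | sort -rn | head -20 | mailx -s "disk hogs" root

with the path adjusted to wherever home directories actually live.)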

The real PITA was cost accounting, especially for government contracts :-(
The cost of managing the storage is much greater than the cost of the
storage, so the trend will inexorably be towards eliminating the management
costs -- hence the management structure of ZFS is simpler than the previous
solutions.  The main gap for .edu sites is quotas which will likely be solved
some other way in the long run...  Meanwhile, pile on
http://bugs.opensolaris.org/view_bug.do?bug_id=6501037
  -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Darren J Moffat
Richard Elling wrote:
 There is also a long tail situation here, which is how I approached the
 problem at eng.Auburn.edu.  1% of the users will use > 90% of the space.
 For them, I had special places.  For everyone else, they were lumped into
 large-ish buckets.  A daily cron job easily identifies the 1% and we could
 proactively redistribute them, as needed.  Of course, quotas are also
 easily defeated and the more clever students played a fun game of
 hide-and-seek, but I digress.  There is more than one way to solve these
 allocation problems.

Ah, I remember those games well, and they are one of the reasons I'm now a 
Solaris developer!  Though at Glasgow Uni's Comp Sci department it 
wasn't disk quotas (peer pressure was used for us) but print quotas, 
which were much more fun to try and bypass and more environmentally 
responsible to quota in the first place.

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Wade . Stuart
[EMAIL PROTECTED] wrote on 09/10/2007 11:40:16 AM:

 Richard Elling wrote:
  There is also a long tail situation here, which is how I approached the
  problem at eng.Auburn.edu.  1% of the users will use > 90% of the space.
  For them, I had special places.  For everyone else, they were lumped
  into large-ish buckets.  A daily cron job easily identifies the 1% and
  we could proactively redistribute them, as needed.  Of course, quotas
  are also easily defeated and the more clever students played a fun game
  of hide-and-seek, but I digress.  There is more than one way to solve
  these allocation problems.

 Ah I remember those games well and they are one of the reasons I'm now a
 Solaris developer!  Though at Glasgow Uni's Comp Sci department it
 wasn't disk quotas (peer pressure was used for us) but print quotas
 which were much more fun to try and bypass and environmentally
 responsible to quota in the first place.


  Very true,  you could even pay people to track down heavy users and
bonk them on the head.  Why is everyone responding with alternate routes to
a simple need?  User quotas have been used in the past, and will be used in
the future because they work (well), are simple, tied into many existing
workflows/systems and very understandable for both end users and
administrators.  You can come up with 100 other ways to accomplish pseudo
user quotas or end runs around the core issue (did we really have google
space farming suggested -- we are reading a FS mailing list here?), but
quotas are tested and well understood fixes to these problems.  Just
because someone decided to call ZFS pool reservations quotas does not mean
the need for real user quotas is gone.

User quotas are a KISS solution to space hogs.
Zpool quotas (really pool reservations) are not unless you can divvy up
data slices into small fs mounts and have no user overlap in the partition.
user quotas + zfs quotas > zfs quotas;

-Wade

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Darren J Moffat
[EMAIL PROTECTED] wrote:
   Very true,  you could even pay people to track down heavy users and
 bonk them on the head.  Why is everyone responding with alternate routes to
 a simple need? 

For the simple reason that sometimes it is good to challenge existing 
practice and try to find the real need rather than "I need X because 
I've always done it using X".

We always used a vfstab and dfstab (or exportfs) file before and used a 
separate software RAID and filesystem before too.

  User quotas have been used in the past, and will be used in
 the future because they work (well), are simple, tied into many existing
 workflows/systems and very understandable for both end users and
  administrators.  You can come up with 100 other ways to accomplish pseudo
 user quotas or end runs around the core issue (did we really have google
 space farming suggested -- we are reading a FS mailing list here?), but
 quotas are tested and well understood fixes to these problems.  Just
 because someone decided to call ZFS pool reservations quotas does not mean
 the need for real user quotas is gone.

Reservations in ZFS are quite different from quotas; ZFS has both 
concepts.  A reservation is a guaranteed minimum, while a quota in ZFS is a 
guaranteed maximum.
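
For example, on a hypothetical dataset:

  zfs set reservation=10G pool/home/ann   # ann is guaranteed at least 10G
  zfs set quota=20G pool/home/ann         # ann can never use more than 20G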



-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Wade . Stuart


[EMAIL PROTECTED] wrote on 09/10/2007 12:13:18 PM:

 [EMAIL PROTECTED] wrote:
   Very true,  you could even pay people to track down heavy users and
  bonk them on the head.  Why is everyone responding with alternate routes
  to a simple need?

 For the simple reason that sometimes it is good to challenge existing
 practice and try and find the real need rather than I need X because
 I've always done it using X.

I am not against refactoring solutions,  but zfs quotas and the lack of
user quotas in general either leave people trying to use zfs quotas in lieu
of user quotas, suggesting weak end runs against the problem (a cron to
calculate hogs), or belittling the need to actually limit disk usage per
user id.  All of these threads to this point have not answered the needs in
any way close to a solution that user quotas allow.





 We always used a vfstab and dfstab (or exportfs) file before and used a
 separate software RAID and filesystem before too.

Yes,  and the replacements (when talking ZFS) are either parity or better
-- that makes switching a win-win.  ENOSUCH when talking user quotas.


   User quotas have been used in the past, and will be used in
  the future because they work (well), are simple, tied into many existing
  workflows/systems and very understandable for both end users and
  administrators.  You can come up with 100 other ways to accomplish pseudo
  user quotas or end runs around the core issue (did we really have google
  space farming suggested -- we are reading a FS mailing list here?), but
  quotas are tested and well understood fixes to these problems.  Just
  because someone decided to call ZFS pool reservations quotas does not mean
  the need for real user quotas is gone.

 Reservations in ZFS are quite different to Quotas, ZFS has both
 concepts.  A reservation is a guaranteed minimum, a quota in ZFS is a
 guaranteed maximum.


Reservations (the general term when talking most of the disk virtualizing
and pooling technologies in play today) usually cover both the floor
(guaranteed space) and ceiling (max alloc space) for the pool volume,
dynamic store, or backing store.  ZFS Quotas (reservations) can be called
whatever you want -- it has just become frustrating when people start
pushing ZFS quotas (reservations) as a drop-in replacement for user quotas.
They are tools for different issues with some overlap.  Even though one can
pound in a nail with a screwdriver,  I would rather have a hammer.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Richard Elling
[EMAIL PROTECTED] wrote:
 All of these threads to this point have not answered the needs in
 any way close to a solution that user quotas allow.

I thought I did answer that... for some definition of answer...

   The main gap for .edu sites is quotas which will likely be solved
  some other way in the long run...  Meanwhile, pile on
  http://bugs.opensolaris.org/view_bug.do?bug_id=6501037

Or, if you're so inclined,
http://cvs.opensolaris.org/source/

The point being that it either isn't a high priority for the ZFS team, there
are other solutions to the problem (which may not require changes to ZFS),
or you can fix it on your own.  You can impact any or all of these things.
  -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-10 Thread Phil Harman

On 10 Sep 2007, at 16:41, Brian H. Nelson wrote:


Stephen Usher wrote:

 Brian H. Nelson:

 I'm sure it would be interesting for those on the list if you could
 outline the gotchas so that the rest of us don't have to re-invent the
 wheel... or at least not fall down the pitfalls.

Also, here's a link to the ufs on zvol blog where I originally found the
idea:

http://blogs.sun.com/scottdickson/entry/fun_with_zvols_-_ufs


Not everything I've seen blogged about UFS and zvols fills me with  
warm fuzzies. For instance, the above takes no account of the fact  
that the UFS filesystem needs to be in a consistent state before a  
snapshot is taken - e.g. using lockfs(1M).


Example:


Preparation ...

basket# zfs create -V 10m pool0/v1
basket# newfs /dev/zvol/rdsk/pool0/v1
newfs: /dev/zvol/rdsk/pool0/v1 last mounted as /tmp/v1
newfs: construct a new file system /dev/zvol/rdsk/pool0/v1: (y/n)? y
Warning: 4130 sector(s) in last cylinder unallocated
/dev/zvol/rdsk/pool0/v1: 20446 sectors in 4 cylinders of 48 tracks, 128 sectors

10.0MB in 1 cyl groups (14 c/g, 42.00MB/g, 20160 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32,

basket# mount -r /dev/zvol/dsk/pool0/v1 /tmp/v1


Scenario 1 ...

basket# date > /tmp/v1/f1; zfs snapshot pool0/[EMAIL PROTECTED]
basket# cat /tmp/v1/f1
Mon Sep 10 23:07:42 BST 2007
basket# mount -r /dev/zvol/dsk/pool0/[EMAIL PROTECTED] /tmp/v1s1
basket# ls /tmp/v1s1
f1   lost+found/
basket# cat /tmp/v1s1/f1

basket# date > /tmp/v1/f1; zfs snapshot pool0/[EMAIL PROTECTED]
basket# mount -r /dev/zvol/dsk/pool0/[EMAIL PROTECTED] /tmp/v1s2
basket# cat /tmp/v1s2/f1
Mon Sep 10 23:07:42 BST 2007
basket# cat /tmp/v1/f1
Mon Sep 10 23:09:19 BST 2007

Note: the first snapshot sees the file but not the contents, while  
the second snapshot sees stale data.



Scenario 2 ...

basket# date > /tmp/v1/f2; lockfs -wf /tmp/v1; zfs snapshot pool0/[EMAIL PROTECTED]; lockfs -u /tmp/v1

basket# mount -r /dev/zvol/dsk/pool0/[EMAIL PROTECTED] /tmp/v1s3
mount: Mount point /tmp/v1s3 does not exist.
basket# mkdir /tmp/v1s3
basket# mount -r /dev/zvol/dsk/pool0/[EMAIL PROTECTED] /tmp/v1s3
basket# cat /tmp/v1s3/f2
Mon Sep 10 23:18:17 BST 2007
basket# cat /tmp/v1/f2
Mon Sep 10 23:18:17 BST 2007
basket#

Note: the snapshot is consistent because of the lockfs(1M) calls.


Phil

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-09 Thread Stephen Usher
Mike Gerdts wrote:
 On 9/8/07, Richard Elling [EMAIL PROTECTED] wrote:
 Changing the topic slightly, the strategic question is:
 why are you providing disk space to students?
 
 For most programming and productivity (e.g. word processing, etc.)
 people will likely be better suited by having network access for their
 personal equipment with local storage.

Local storage would be a nightmare for secure back-ups. Having said
that, for those using Windows PCs and MacOS X we do let them have control
of their machine and store things locally, but at their own risk. The
central service merely provides a (smallish) home directory which we
guarantee to back up. Quotas are needed in this case because users can't
be trusted to play fair, especially if they don't realise how big the
files that they are dragging and dropping are. These machines are also
firewalled to hell and back.

For the rest of the researchers, who have Linux or Solaris machines, we
do not allow them administrative access. All software and home
directories are NFS mounted from the central server so that any machine
a user logs into will give them the same set of tools and they can
do their work anywhere they need to. Their home directories need to be
policed by the system because, firstly, users can't be fully trusted to
play fair and, secondly, some software will try to cache lots of data in
their home directories without the user knowing.

Now, in our current set-up all these users have a soft limit and a hard
quota. Every night a cron job parses the output of repquota -a and
informs those people who have gone over their soft quota and hard quota.
The difference in size between the soft and hard quotas is enough that,
in general, it doesn't affect the user's work and allows them to
remediate the problem before it becomes critical (and important files
suddenly get emptied or the user can't log in).
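
The job itself is nothing clever -- roughly this shape, though the exact
repquota column layout and the mail wording are obviously site-specific:

  #!/bin/sh
  # Rough sketch: warn users whose block usage is flagged '+' by repquota
  repquota -a | awk '$2 ~ /\+/ {print $1}' | while read user
  do
      echo "You are over your soft disk quota; please tidy up." | \
          mailx -s "Disk quota warning" "$user"
  done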

For large datasets the research groups have their own servers from which
data etc. is available. As said previously, the central allocation of
space is merely enough for day-to-day documents/theses/papers etc.

Oh, and our HPC grid is fully integrated into this set-up as well, the
idea being a consistent experience throughout the research network.

Steve
--
---
Computer Systems Administrator, E-Mail:[EMAIL PROTECTED]
Department of Earth Sciences,  Tel:-  +44 (0)1865 282110
University of Oxford, Parks Road, Oxford, UK.  Fax:-  +44 (0)1865 272072


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-09 Thread Casper . Dik


Mounts under /net are derived from the filesystems actually shared
from the servers; the automount daemon uses the MOUNT protocol to
determine this.  If you're looking at a path not already seen, the
information will be fresh, but that's where the good news ends.
We don't refresh this information reliably, so if you add a new
share in a directory we've already scanned, you won't see it until
the mounts time out and are removed.  We should refresh this data
more readily and no matter what the source of data.


I know that, yes, but why can't we put such an abstraction elsewhere in
the name space?  One thing I have always disliked about /net mounts is
that they're too magical; it should be possible to replicate them
in some form in other mount maps.

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-09 Thread David Magda
On Sep 7, 2007, at 18:25, Stephen Usher wrote:

  (I still have many-many machines on Solaris 8) I can see it
 being at least a decade until all the machines we have are at a
 level to handle NFSv4.

If you need to have a Solaris 8 environment, but want to minimize the  
number of machines you have to manage, the recently announced Project  
Etude may be of some interest to you:

http://blogs.sun.com/dp/entry/project_etude_revealed

It creates a Solaris 8 environment in a Solaris 10 container / zone.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-09 Thread Alec Muffett

 Mounts under /net are derived from the filesystems actually shared
 from the servers; the automount daemon uses the MOUNT protocol to
 determine this.  If you're looking at a path not already seen, the
 information will be fresh, but that's where the good news ends.

 I know that, yes, but why can't we put such an abstraction  
 elsewhere in
 the name space?  One thing I have always disliked about /net mounts is
 that they're too magical; it should be possible to replicate them
 in some form in other mount maps.

In short, you're proposing a solution to the zillions-of-nfs-exports
issue which, instead of "wait for v4 to implement a server-side export
consolidation thingy", would be a better, smarter /net-alike on v2/v3,
but with a sensible name and better namespace semantics?

I could go for that...

-a

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-08 Thread Casper . Dik


For NFSv2/v3, there's no easy answers.  Some have experimented
with executable automounter maps that build a list of filesystems
on the fly, but ick.  At some point, some of the global namespace
ideas we kick around may benefit NFSv2/v3 as well.


The question for me is: why does this work for /net mounts (to a point,
of course) and why can't we emulate this for other mount points?

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-08 Thread Richard Elling
Stephen Usher wrote:
 I've just subscribed to this list after Alec's posting and reading the 
 comments in the archive and I have a couple of comments:
   

Welcome Steve,
I think you'll find that we rehash this about every quarter with an
extra kicker just before school starts in the fall.

Changing the topic slightly, the strategic question is:
why are you providing disk space to students?

When you solve this problem, the quota problem is moot.

NB. I managed a large University network for several years, and
am fully aware of the costs involved.  I do not believe that the
1960s timeshare model will survive in such environments.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-08 Thread Mike Gerdts
On 9/8/07, Richard Elling [EMAIL PROTECTED] wrote:
 Changing the topic slightly, the strategic question is:
 why are you providing disk space to students?

For most programming and productivity (e.g. word processing, etc.)
people will likely be better suited by having network access for their
personal equipment with local storage.

For cases when specialized expensive tools ($10k + per seat) are used,
it is not practical to install them on hundreds or thousands of
personal devices for a semester or two of work.  The typical computing
lab that provides such tools is not well equipped to deal with
removable media such as flash drives.  Further, such tools will often
times be used to do designs that require simulations to run as batch
jobs that run under grid computing tools such as Grid Engine, Condor,
LSF, etc.

Then, of course, there are files that need to be shared, have reliable
backups, etc.  Pushing that out to desktop or laptop machines is not
really a good idea.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-08 Thread Stephen Usher
Richard Elling wrote:
 Stephen Usher wrote:
 I've just subscribed to this list after Alec's posting and reading 
 the comments in the archive and I have a couple of comments:
   

 Welcome Steve,
 I think you'll find that we rehash this about every quarter with an
 extra kicker just before school starts in the fall.

 Changing the topic slightly, the strategic question is:
 why are you providing disk space to students?
This is actually the research network, so this is for faculty, 
post-doctoral fellows and post-graduate students to do their research 
jobs. The only undergraduates involved are 4th year ones doing research 
projects within the research teams. The space being allocated is the 
basic resource supplied centrally by the Department and for some is the 
only resource that they have, as they don't get any money for their own 
computing systems in their grants.
 When you solve this problem, the quota problem is moot.
Not really, not when you have few resources but have to give them out 
fairly.

Steve
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-08 Thread Robert Thurlow
[EMAIL PROTECTED] wrote:
 
 For NFSv2/v3, there's no easy answers.  Some have experimented
 with executable automounter maps that build a list of filesystems
 on the fly, but ick.  At some point, some of the global namespace
 ideas we kick around may benefit NFSv2/v3 as well.
 
 
 The question for me is: why does this work for /net mounts (to a point,
 of course) and why can't we emulate this for other mount points?

Mounts under /net are derived from the filesystems actually shared
from the servers; the automount daemon uses the MOUNT protocol to
determine this.  If you're looking at a path not already seen, the
information will be fresh, but that's where the good news ends.
We don't refresh this information reliably, so if you add a new
share in a directory we've already scanned, you won't see it until
the mounts time out and are removed.  We should refresh this data
more readily and no matter what the source of data.

Rob T
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Robert Thurlow
Alec Muffett wrote:

 But finally, and this is the critical problem, each user's home
 directory is now a separate NFS share.

 At first look that final point doesn't seem to be much of a worry
 until you look at the implications that brings. To cope with a
 distributed system with a large number of users the only manageable
 way of handling NFS mounts is via an automounter. The only
 alternative would be to have an fstab/vfstab file holding every
 filesystem any user might want. In the past this has been no
 problem at all; for all your user home directories on a server you
 could just export the parent directory holding all the user home
 directories, put in a line "users -rw,intr myserver:/disks/users",
 and it would work happily.

 Now, with each user having a separate filesystem this breaks. The
 automounter will mount the parent filesystem as before but all you
 will see are the stub directories ready for the ZFS daughter
 filesystems to mount onto, and there's no way of consolidating the
 ZFS filesystem tree into one NFS share or rules in automount map
 files to be able to do sub-directory mounting.

Sun's NFS team is close to putting back a fix to the Nevada NFS
client for this where a single mount of the root of a ZFS tree
lets you wander into the daughter filesystems on demand, without
automounter configuration.  You have to be using NFSv4, since it
relies on the server namespace protocol feature.  Some other
NFSv4 clients already do this.  This has always been a part of
the plan to cope with more right-sized filesystems; we're just
not there yet.

For NFSv2/v3, there's no easy answers.  Some have experimented
with executable automounter maps that build a list of filesystems
on the fly, but ick.  At some point, some of the global namespace
ideas we kick around may benefit NFSv2/v3 as well.

Rob T
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Mike Gerdts
On 9/7/07, Alec Muffett [EMAIL PROTECTED] wrote:
  The main bugbear is what the ZFS development team laughably call
  quotas. They aren't quotas, they are merely filesystem size
  restraints. To get around this the developers use the "let them eat
  cake" mantra: creating filesystems is easy, so create a new
  filesystem for each user, with a quota on it. This is the ZFS way.
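
(For reference, that approach boils down to a couple of commands per
user -- names here are purely illustrative:

  zfs create pool/home/jbloggs
  zfs set quota=500m pool/home/jbloggs

with the parent dataset's mountpoint and sharenfs settings inherited.)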

Having worked in academia and multiple Fortune 100's, the problem
seems to be most prevalent in academia, although possibly a minor
inconvenience in some engineering departments in industry.  In the
.edu where I used to manage the UNIX environment, I would have a tough
time weighing the complexities of quotas he mentions vs. the other
niceties.  My guess is that unless I had something that was really
broken, I would stay with UFS or VxFS waiting for a fix.

It appears as though the author has not yet tried out snapshots.  The
fact that space used by a snapshot for the sysadmin's convenience
counts against the user's quota is the real killer.  This would force
me into a disk to disk (rsync, because zfs send | zfs recv would
require snapshots to stay around for incrementals) backup + snapshot
scenario to be able to keep snapshots while minimizing their impact on
users.  That means double the disk space.  Doubling the quota is not
an option because without soft quotas there is no way to keep people
from using all of their space.  Frankly, that would be so much trouble
I would be better off using tape for restores, just like with UFS or
VxFS.

  Now, with each user having a separate filesystem this breaks. The
  automounter will mount the parent filesystem as before but all you
  will see are the stub directories ready for the ZFS daughter
  filesystems to mount onto and there's no way of consolidating the
  ZFS filesystem tree into one NFS share or rules in automount map
  files to be able to do sub-directory mounting.

While NFS4 holds some promise here, it is not a solution today.  It
won't be until all OS's that came out before 2008 are gone.  That will
be a while.

Use of macros (e.g. * server:/home/&) can go a long ways.  If that
doesn't do it, an executable map that does the appropriate munging may
be in order.
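
For instance, an executable map is just a script with the execute bit
set that prints an entry for whatever key automountd hands it -- a
sketch, with an invented server name:

  #!/bin/sh
  # executable auto_home map: the lookup key (the username) arrives as $1
  echo "bigserver:/export/zfs/home/$1"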

  The problem here is one of legacy code, which you'll find
  throughout the academic, and probably commercial world. Basically,
  there's a lot of user generated code which has hard coded paths so
  any new system has to replicate what has gone before. (The current
  system here has automount map entries which map new disks to the
  names of old disks on machines long gone, e.g. /home/eeyore_data/ )

Put such entries before the *  entry and things should be OK.

For me, quotas are likely to be a pain point that prevents me from
making good use of snapshots.  Getting changes in application teams'
understanding and behavior is just too much trouble.  Others are:

1. There seems to be no integration with backup tools that are
time+space+I/O efficient.  If my storage is on Netapp, I can use NDMP
to do incrementals between snapshots.  No such thing exists with ZFS.

2. Use of clones is out because I can't do a space-efficient restore.

3. ARC messes up my knowledge of how much RAM my machine is making
good use of.  After the first backup, vmstat says that I am just at
the brink of not having enough RAM that paging (file system and pager)
will begin soon.  This may be fine on a file server, but it really
messes with me if it is a J2EE server and I'm trying to figure out how
many more app servers I can add.

I have a lot of hopes for ZFS and have used it with success (and
failures) in limited scope.  I'm sure that with time the improvements
will come that make that scope increase dramatically, but for now it
is confined to the lab.  :(

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Nicolas Williams
The complaint is not new, and the problem isn't quotas or lack thereof.

The problem is that remote filesystem clients can't cope with frequent
changes to a server's share list, which is just what ZFS's "filesystems
are cheap" approach promotes.

Basically ZFS was ahead of everyone's implementation of NFSv4 client-
side mount mirroring, which would very much help with the dynamic nature
of ZFS usage.

It does not help that no NFSv3 automounter is sufficiently dynamic to
reasonably cope with filesystems coming and going.

Given the automounter pain this customer would like to have one large
filesystem and quotas.  And that's how quotas are a secondary problem.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread mike
On 9/7/07, Mike Gerdts [EMAIL PROTECTED] wrote:
 For me, quotas are likely to be a pain point that prevents me from
 making good use of snapshots.  Getting changes in application teams'
 understanding and behavior is just too much trouble.  Others are:

not to mention there are smaller-scale users that want the data
protection, checksumming and scalability that ZFS offers (although the
whole vdev/zpool/etc. thing might wind up causing me to have to buy
more disks to add more space, if i were to use it)

it would be nice to have a ZFS lite(tm) for those of us that just want
easily expandable filesystems (as in, add a new disk/device and not
have to think of some larger geometry) with inline
checksumming/COW/metadata/ditto blocks/etc/etc goodness. basically
like a home edition. i don't care about LUNs, send/receive, quotas,
snapshots (for the most part), setting up different zpools to gain
specific performance benefits, etc. i just want raid-z/raid-z2 with an
easy way to add disks.

i have not actually used ZFS yet because i've been waiting for
opensolaris/solaris (or even freebsd possibly) to support eSATA
hardware or something related. the hardware support front for SOHO
users has also been slow. that's not a shortcoming of ZFS though...
but does make me wish i had the basic protection features of ZFS with
hardware support like linux.

- my two cents
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Chris Kirby
Mike Gerdts wrote:
 It appears as though the author has not yet tried out snapshots.  The
 fact that space used by a snapshot for the sysadmin's convenience
 counts against the user's quota is the real killer. 

Very soon there will be another way to specify quotas (and
reservations) such that they only apply to the space used by
the active dataset.

This should make the effect of quotas more obvious to end users
while allowing them to remain blissfully unaware of any snapshot
activity by the sysadmin.
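
A sketch of what that might look like from the command line (the
property name here is a guess until the feature integrates):

  zfs set refquota=1G pool/home/user1

i.e. only the space referenced by the active dataset, and not by its
snapshots, would count against the limit.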

-Chris



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Brian H. Nelson
Mike Gerdts wrote:
 Having worked in academia and multiple Fortune 100's, the problem
 seems to be most prevalent in academia, although possibly a minor
 inconvenience in some engineering departments in industry.  In the
 .edu where I used to manage the UNIX environment, I would have a tough
 time weighing the complexities of quotas he mentions vs. the other
 niceties.  My guess is that unless I had something that was really
 broken, I would stay with UFS or VxFS waiting for a fix.
   

UFS on a zvol is a pretty good compromise. You get lots of the nice ZFS 
stuff (checksums, raidz/z2, snapshots, growable pool, etc) with no 
changes in userland.

There are a couple of gotchas, but as long as you're aware of them it 
works pretty well. We've been using it since January.

-Brian

-- 
---
Brian H. Nelson Youngstown State University
System Administrator   Media and Academic Computing
  bnelson[at]cis.ysu.edu
---

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Stephen Usher
I've just subscribed to this list after Alec's posting and reading the 
comments in the archive and I have a couple of comments:

Mike Gerdts:

While NFS4 holds some promise here, it is not a solution today.  It
won't be until all OS's that came out before 2008 are gone.  That will
be a while.

Well, seeing as only a few days ago I put the last of our SPARCstation 
1s into the recycle pile, have in daily use a DEC Alphastation (circa 
1996) running Digital UNIX 4.2C, which the new server will need to 
support, and have just managed to migrate the last machine off 
Solaris 7 (I still have many-many machines on Solaris 8), I can see it 
being at least a decade until all the machines we have are at a level 
to handle NFSv4.

From your analysis it does look like UFS is the only way to go 
presently. However, this is likely to mean that I'm tied to UFS for the 
lifetime of the server, which is probably in the 7-10 year timescale.

Brian H. Nelson:

I'm sure it would be interesting for those on the list if you could 
outline the gotchas so that the rest of us don't have to re-invent the 
wheel... or at least not fall down the pitfalls.


Nicolas Williams:

Unfortunately for us at the coal face it's very rare that we can do the 
ideal thing. Quotas are part of the problem, but the main problem is that 
there is currently no way of overcoming the interoperability problems 
using the toolset offered by ZFS.

One way around this for NFSv2/3 clients would be if the ZFS NFS server 
could consolidate a tree of filesystems so that to the clients it 
looks like one filesystem. From outside the development group this 
seems like the 90% solution which would probably take less engineering 
effort than the full implementation of a user quota system. I'm not sure 
why the OS (outside the ZFS subsystem) would need to know that the 
directory tree it's seeing is composed of separate filesystems and is 
not just one big filesystem. (Unless, of course, there are tape archival 
programs which need to save and recreate ZFS sub-filesystems.) It 
would also have the added benefit of making df(1) usable again. ;-)


Believe me when I say that I'd love to use ZFS and would love to be able 
to recommend it to everyone as, other than this particular set of 
problems, it seems such a great system. My posting on Slashdot was the 
culmination of frustration and disappointment after a number of days 
trying every trick I could think of to get it working and failing.

Steve
--
---
Computer Systems Administrator,E-Mail:[EMAIL PROTECTED]
Department of Earth Sciences, Tel:-  +44 (0)1865 282110
University of Oxford, Parks Road, Oxford, UK. Fax:-  +44 (0)1865 272072

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Nicolas Williams
On Fri, Sep 07, 2007 at 11:25:38PM +0100, Stephen Usher wrote:
 Nicolas Williams:
 
 Unfortunately for us at the coal face it's very rare that we can do the 
 ideal thing. Quotas are part of the problem but the main problem is that 
 there is currently no way over overcoming the interoperability problems 
 using the toolset offered by ZFS.

Understood.  I'll let the ZFS team answer this.

 One way around this for NFSv2/3 clients would be if the ZFS NFS server 
 could consolidate a tree of filesystems so that to the clients it 
 looks like one filesystem. From the outside the development group this 
 seems like the 90% solution which would probably take less engineering 
 effort than the full implementation of a user quota system. I'm not sure 
 why the OS (outside the ZFS subsystem) would need to know that the 
 directory tree it's seeing is composed of separate filesystems and is 
 not just one big filesystem. (Unless, of course, there are tape archival 
 programs which require to save and recreate ZFS sub-filesystems.) It 
 would also have the added benefit of making df(1) usable again. ;-)

Unfortunately there's no way to do this and preserve NFS and POSIX
semantics (those preserved by NFS).  Think of hard links, to name but
one very difficult problem.  Just the task of creating a uniform,
persistent inode number space out of a multitude of distinct filesystems
would be daunting indeed.

That is, there are good technical reasons why what you propose is
non-trivial.

The "why the OS ... would need to know that the directory tree it's
seeing is composed of separate filesystems" lies in POSIX semantics.
And it's as true on the client side as on the server side.  The problem
you're running into is a limitation of the *client*, not of the server.
The quota support you're asking for is to enable a server-side
workaround for a client-side problem.

 Believe me when I say that I'd love to use ZFS and would love to be able 
 to recommend it to everyone as, other than this particular set of 
 problems, it seems such a great system. My posting on Slashdot was the 
 culmination of frustration and disappointment after a number of days 
 trying every trick I could think of to get it working and failing.

My view (remember, I'm not in the ZFS team) is that ZFS may simply not
be applicable to your use case, and that you may find other use cases where
it is applicable.  If adding quota support is easy, if it's all you need
to work around the automounter issue, and if my opinion mattered, then I'd
say that we should have ZFS user quotas.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Mike Gerdts
On 9/7/07, Stephen Usher [EMAIL PROTECTED] wrote:

 Brian H. Nelson:

 I'm sure it would be interesting for those on the list if you could
 outline the gotchas so that the rest of us don't have to re-invent the
 wheel... or at least not fall down the pitfalls.

The UFS on zvols option sounds intriguing to me, but I would guess
that the following could be problems:

1) Double buffering:  Will ZFS store data in the ARC while UFS uses
traditional file system buffers?

2) Boot order dependencies.  How does the startup of zfs compare to
processing of /etc/vfstab?  I would guess that this is OK due to
legacy mount type supported by zfs.  If this is OK, then dfstab
processing is probably OK.

I say intriguing because it could give you the improved data
integrity checks and a bit more flexibility in how you do things like
backups and restores.  Snapshots of the zvols could be mounted as
other UFS file systems, which could allow for self-service restores.
Perhaps this would make it so that you can write data to tape a bit
less frequently.
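
A rough sketch of that flow (names invented; UFS should be quiesced
with lockfs so the snapshot is consistent):

  lockfs -wf /export/home2
  zfs snapshot pool0/homevol@nightly
  lockfs -u /export/home2
  mount -F ufs -r /dev/zvol/dsk/pool0/homevol@nightly /export/restore

after which yesterday's files can simply be copied back out of
/export/restore.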

If deduplication comes into zfs, you may be able to get to a point
where course project instructions that say cp ~course/hugefile ~
become not so expensive - you would be charging quota to each user but
only storing one copy.  Depending on the balance of CPU power vs. I/O
bandwidth, compressed zvols could be a real win, more than paying back
the space required to have a few snapshots around.
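
Enabling compression on a zvol is a single property set (dataset name
illustrative), and the payback is easy to check afterwards:

  zfs set compression=on pool0/homevol
  zfs get compressratio pool0/homevol

though only data written after the property is set gets compressed.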

Mike
-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Brian Hechinger
On Fri, Sep 07, 2007 at 06:19:34PM -0500, Mike Gerdts wrote:
 
 backups and restores.  Snapshots of the zvols could be mounted as
 other UFS file systems that could allow for self-service restores.
 Perhaps this would make it so that you can write data to tape a bit
 less frequently.

This would be a huge win I think.  We do something similar with our mail
system (NFS mounted to a NetApp).  We quiesce all the dbs (bdb essentially)
and execute a snapshot.  It takes mere moments.  Then we back up from the
snapshot.  This allows us to perform a multi-hour backup without having
to take the mail system offline at all.

To be able to apply this to other systems, especially ones that wouldn't
even know any better (UFS, NTFS, etc) would certainly be a nice way to go.

In fact, I'll have to try this on the XP box on my desk that mounts iSCSI
zvols the next time I'm in the office. ;)

-brian
-- 
Perl can be fast and elegant as much as J2EE can be fast and elegant.
In the hands of a skilled artisan, it can and does happen; it's just
that most of the shit out there is built by people who'd be better
suited to making sure that my burger is cooked thoroughly.  -- Jonathan 
Patschke
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss