Re: [zfs-discuss] ZFS API (again!), need quotactl(7I)

2006-09-12 Thread Darren J Moffat

Jeff A. Earickson wrote:

Hi,

I was looking for the zfs system calls to check zfs quotas from
within C code, analogous to the quotactl(7I) interface for UFS,
and realized that there was nothing similar.  Is anything like this
planned?  Why no public API for ZFS?

Do I start making calls to zfs_prop_get_int(), like in the df
code, to find out what I want?  Will this blow up later?


What is it that you are trying to do here ?

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: zfs share=.foo-internal.bar.edu on multiple interfaces?

2006-09-12 Thread Nicolas Dorfsman
 I have a Sun x4200 with 4x gigabit ethernet NICs.  I
 have several of 
 them configured with distinct IP addresses on an
 internal (10.0.0.0) 
 network.

[off topic]
Why are you using distinct IP addresses instead of IPMP ?
[/off]
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread Darren J Moffat

Mike Gerdts wrote:

On 9/11/06, Matthew Ahrens [EMAIL PROTECTED] wrote:

B. DESCRIPTION

A new property will be added, 'copies', which specifies how many copies
of the given filesystem will be stored.  Its value must be 1, 2, or 3.
Like other properties (eg.  checksum, compression), it only affects
newly-written data.  As such, it is recommended that the 'copies'
property be set at filesystem-creation time
(eg. 'zfs create -o copies=2 pool/fs').


Is there anything in the works to compress (or encrypt) existing data
after the fact?  For example, a special option to scrub that causes
the data to be re-written with the new properties could potentially do
this.  If so, this feature should subscribe to any generic framework
provided by such an effort.


While encryption of existing data is not in scope for the first ZFS 
crypto phase I am being careful in the design to ensure that it can be 
done later if such a ZFS framework becomes available.


The biggest problem I see with this is one of observability: if not all
of the data is encrypted yet, what should the encryption property say?
If it says encryption is on then the admin might think the data is
safe, but if it says it is off that isn't the truth either, because
some of it may be encrypted.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread Dick Davies

On 12/09/06, Matthew Ahrens [EMAIL PROTECTED] wrote:

Here is a proposal for a new 'copies' property which would allow
different levels of replication for different filesystems.

Your comments are appreciated!


Flexibility is always nice, but this seems to greatly complicate things,
both technically and conceptually (sometimes, good design is about what
is left out :) ).

Seems to me this lets you say 'files in this directory are x times more
valuable than files elsewhere'. Others have covered some of my
concerns (guarantees, cleanup, etc.). In addition,

* if I move a file somewhere else, does it become less important?
* zpools let you do that already
 (admittedly with less granularity, but *much* *much* more simply -
 and disk is cheap in my world)
* I don't need to do that :)

The only real use I'd see would be for redundant copies
on a single disk, but then why wouldn't I just add a disk?

* disks are cheap, and creating a mirror from a single disk is very easy
 (and conceptually simple)
* *removing* a disk from a mirror pair is simple too - I make mistakes
 sometimes
* in my experience, disks fail. When you get bad errors on part of a disk,
 the disk is about to die.
* you can already create a/several zpools using disk
 partitions as vdevs. That's not all that safe, and I don't see this being
 any safer.


Sorry to be negative, but to me ZFS' simplicity is one of its major features.
I think this provides a cool feature, but I question its usefulness.

Quite possibly I just don't have the particular itch this is intended
to scratch - is this a much requested feature?


--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Proposal: multiple copies of user data

2006-09-12 Thread Ceri Davies
 Hi Matt,
 Interesting proposal.  Has there been any
 consideration if free space being reported for a ZFS
 filesystem would take into account the copies
  setting?
 
 Example:
 zfs create mypool/nonredundant_data
 zfs create mypool/redundant_data
 df -h /mypool/nonredundant_data /mypool/redundant_data
 (shows same amount of free space)
 zfs set copies=3 mypool/redundant_data
 
 Would a new df of /mypool/redundant_data now show a
 different amount of free space (presumably 1/3 if
 different) than /mypool/nonredundant_data?

As I understand the proposal, there's nothing new to do here.  The filesystem 
might be 25% full, and it would be 25% full no matter how many copies of the 
filesystem there are.

Similarly with quotas, I'd argue that the extra copies should not count towards 
a user's quota, since a quota is set on the filesystem.  If I'm using 500M on a 
filesystem, I only have 500M of data no matter how many copies of it the 
administrator has decided to keep (cf. RAID1).

I also don't see why a copy can't just be dropped if the copies value is 
decreased.

Having said this, I don't see any value in the proposal at all, to be honest.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Re: ZFS + rsync, backup on steroids.

2006-09-12 Thread Bui Minh Truong
Thank you all for your advices.

Finally, I chose to write two scripts (client & server) using port
forwarding via SSH, for security reasons.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: ZFS + rsync, backup on steroids.

2006-09-12 Thread Boyd Adamson

On 12/09/2006, at 1:28 AM, Nicolas Williams wrote:

On Mon, Sep 11, 2006 at 06:39:28AM -0700, Bui Minh Truong wrote:

Does ssh -v tell you any more ?
I don't think the problem is ZFS send/recv. I think it takes a lot of
time to connect over SSH.
I tried to access SSH by typing: ssh remote_machine. It also takes
several seconds (one or half a second) to connect. Maybe because
of Solaris SSH.

If you have 1000 files, it may take: 1000 x 0.5 = 500 seconds


You're not making an SSH connection for every file though --
you're making an SSH connection for every snapshot.

Now, if you're taking snapshots every second, and each SSH connection
takes on the order of .5 seconds, then you might have a problem.


So I gave up on that solution. I wrote two Perl scripts,
client and server. Their roles are similar to ssh and sshd, so I can
connect faster.


But is that secure?


Do you have any suggestions?


Yes.

First, let's see if SSH connection establishment latency is a real
problem.

Second, you could adapt your Perl scripts to work over a persistent SSH
connection, e.g., by using SSH port forwarding:

% ssh -N -L 12345:localhost:56789 remote-host

Now you have a persistent SSH connection to remote-host that forwards
connections to localhost:12345 to port 56789 on remote-host.

So now you can use your Perl scripts more securely.
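
For example, with the port numbers above (the tunnel is put in the
background with -f, and telnet here just stands in for whatever client
actually talks through the tunnel):

% ssh -f -N -L 12345:localhost:56789 remote-host
% telnet localhost 12345

Anything the client sends to localhost:12345 travels inside the existing
SSH session and comes out at port 56789 on remote-host.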


It would be *so* nice if we could get some of the OpenSSH behaviour  
in this area. Recent versions include the ability to open a  
persistent connection and then automatically re-use it for subsequent  
connections to the same host/user.
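
For reference, a minimal sketch of that OpenSSH connection-sharing setup
(the ControlMaster feature; the host name and socket path below are just
examples):

% cat ~/.ssh/config
Host backuphost
    ControlMaster auto
    ControlPath ~/.ssh/master-%r@%h:%p

The first ssh to backuphost creates the master connection; subsequent
ssh/scp/sftp runs to the same host and user reuse it and skip the
connection setup.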


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS API (again!), need quotactl(7I)

2006-09-12 Thread Jeff A. Earickson

On Tue, 12 Sep 2006, Darren J Moffat wrote:


Date: Tue, 12 Sep 2006 10:30:33 +0100
From: Darren J Moffat [EMAIL PROTECTED]
To: Jeff A. Earickson [EMAIL PROTECTED]
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS API (again!), need quotactl(7I)

Jeff A. Earickson wrote:

Hi,

I was looking for the zfs system calls to check zfs quotas from
within C code, analogous to the quotactl(7I) interface for UFS,
and realized that there was nothing similar.  Is anything like this
planned?  Why no public API for ZFS?

Do I start making calls to zfs_prop_get_int(), like in the df
code, to find out what I want?  Will this blow up later?


What is it that you are trying to do here ?


Modify the dovecot IMAP server so that it can get zfs quota information
to be able to implement the QUOTA feature of the IMAP protocol (RFC 2087).
In this case, pull the zfs quota numbers for the quota'd home directory/zfs
filesystem.  Just like what quotactl() would do with UFS.

I am really surprised that there is no zfslib API to query/set zfs
filesystem properties.  Doing a fork/exec just to execute a zfs get
or zfs set is expensive and inelegant.

Jeff Earickson
Colby College
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] ZFS and free space

2006-09-12 Thread Robert Milkowski
Hello Mark,

Monday, September 11, 2006, 4:25:40 PM, you wrote:

MM Jeremy Teo wrote:
 Hello,
 
 how are writes distributed as the free space within a pool reaches a
 very small percentage?
 
 I understand that when free space is available, ZFS will batch writes
 and then issue them in sequential order, maximising write bandwidth.
 When free space reaches a minimum, what happens?
 
 Thanks! :)
 
MM Just what you would expect to happen:

MM As contiguous write space becomes unavailable, writes will be come
MM scattered and performance will degrade.  More importantly: at this
MM point ZFS will begin to heavily write-throttle applications in order
MM to ensure that there is sufficient space on disk for the writes to
MM complete.  This means that there will be less writes to batch up
MM in each transaction group for contiguous IO anyway.

MM As with any file system, performance will tend to degrade at the
MM limits.  ZFS keeps a small overhead reserve (much like other file
MM systems) to help mitigate this, but you will definitely see an
MM impact.

I hope it won't be a problem if space is getting low in a file system
with a quota set, while the pool the file system is in still has plenty
of space, right?


-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread Dick Davies

On 12/09/06, Darren J Moffat [EMAIL PROTECTED] wrote:

Dick Davies wrote:

 The only real use I'd see would be for redundant copies
 on a single disk, but then why wouldn't I just add a disk?

Some systems have physical space for only a single drive - think most
laptops!


True - I'm a laptop user myself. But as I said, I'd assume the whole disk
would fail (it does in my experience).

If your hardware craps differently to mine, you could do a similar thing
with partitions (or even files) as vdevs. Wouldn't be any less reliable.

I'm still not Feeling the Magic on this one :)

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Bizarre problem with ZFS filesystem

2006-09-12 Thread Anantha N. Srirama
I'm experiencing a bizarre write performance problem while using a ZFS
filesystem. Here are the relevant facts:

# zpool list
NAME      SIZE    USED    AVAIL   CAP   HEALTH   ALTROOT
mtdc      3.27T   502G    2.78T   14%   ONLINE   -
zfspool   68.5G   30.8G   37.7G   44%   ONLINE   -

# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
mtdc   503G  2.73T  24.5K  /mtdc
mtdc/sasmeta   397M   627M   397M  /sasmeta
mtdc/u001 30.5G   226G  30.5G  /u001
mtdc/u002 29.5G   227G  29.5G  /u002
mtdc/u003 29.5G   226G  29.5G  /u003
mtdc/u004 28.4G   228G  28.4G  /u004
mtdc/u005 28.3G   228G  28.3G  /u005
mtdc/u006 29.8G   226G  29.8G  /u006
mtdc/u007 30.1G   226G  30.1G  /u007
mtdc/u008 30.6G   225G  30.6G  /u008
mtdc/u099  266G   502G   266G  /u099
zfspool   30.8G  36.6G  24.5K  /zfspool
zfspool/apps  30.8G  33.2G  28.5G  /apps
zfspool/[EMAIL PROTECTED]  2.28G  -  29.8G  -
zfspool/home  15.4M  2.98G  15.4M  /home

# zfs get all mtdc/u099
NAME       PROPERTY       VALUE                  SOURCE
mtdc/u099  type           filesystem             -
mtdc/u099  creation       Thu Aug 17 10:21 2006  -
mtdc/u099  used           267G                   -
mtdc/u099  available      501G                   -
mtdc/u099  referenced     267G                   -
mtdc/u099  compressratio  3.10x                  -
mtdc/u099  mounted        yes                    -
mtdc/u099  quota          768G                   local
mtdc/u099  reservation    none                   default
mtdc/u099  recordsize     128K                   default
mtdc/u099  mountpoint     /u099                  local
mtdc/u099  sharenfs       off                    default
mtdc/u099  checksum       on                     default
mtdc/u099  compression    on                     local
mtdc/u099  atime          off                    local
mtdc/u099  devices        on                     default
mtdc/u099  exec           on                     default
mtdc/u099  setuid         on                     default
mtdc/u099  readonly       off                    default
mtdc/u099  zoned          off                    default
mtdc/u099  snapdir        hidden                 default
mtdc/u099  aclmode        groupmask              default
mtdc/u099  aclinherit     secure                 default

No error messages listed by zpool or /var/opt/messages. When I try to
save a file the operation takes an inordinate amount of time, in the 30+ second
range!!! I truss'd the vi session to see the hangup and it waits at the write
system call.

# truss -p pid
read(0, 0xFFBFD0AF, 1)  (sleeping...)
read(0,  w, 1)= 1
write(1,  w, 1)   = 1
read(0,  q, 1)= 1
write(1,  q, 1)   = 1
read(0, 0xFFBFD00F, 1)  (sleeping...)
read(0, \r, 1)= 1
ioctl(0, I_STR, 0x000579F8) Err#22 EINVAL
write(1, \r, 1)   = 1
write(1,   d e l e t e m e , 10)= 10
stat64(deleteme, 0xFFBFCFA0)  = 0
creat(deleteme, 0666) = 4
ioctl(2, TCSETSW, 0x00060C10)   = 0
write(4,  l f f j d\n, 6) = 6   still waiting
while I type this message!!

This problem manifests itself only on this filesystem and not on the other ZFS 
filesystems on the same server built from the same ZFS pool. While I was 
awaiting completion of the above write I was able to start a new vi session in 
another window and saved a file to the /u001 filesystem without any problem. 
System loads are very low. Can anybody comment on this bizarre behavior?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread Jeff Victor
This proposal would benefit greatly by a problem statement.  As it stands, it 
feels like a solution looking for a problem.


The Introduction mentions a different problem and solution, but then pretends that 
there is value to this solution.  The Description section mentions some benefits 
of 'copies' relative to the existing situation, but requires that the reader piece 
together the whole picture.  And IMO there aren't enough pieces :-) , i.e. so far 
I haven't seen sufficient justification for the added administrative complexity 
and potential for confusion, both administrative and user.


Matthew Ahrens wrote:
Here is a proposal for a new 'copies' property which would allow 
different levels of replication for different filesystems.


Your comments are appreciated!

--matt

A. INTRODUCTION

ZFS stores multiple copies of all metadata.  This is accomplished by
storing up to three DVAs (Disk Virtual Addresses) in each block pointer.
This feature is known as Ditto Blocks.  When possible, the copies are
stored on different disks.

See bug 6410698 ZFS metadata needs to be more highly replicated (ditto
blocks) for details on ditto blocks.

This case will extend this feature to allow system administrators to
store multiple copies of user data as well, on a per-filesystem basis.
These copies are in addition to any redundancy provided at the pool
level (mirroring, raid-z, etc).

B. DESCRIPTION

A new property will be added, 'copies', which specifies how many copies
of the given filesystem will be stored.  Its value must be 1, 2, or 3.
Like other properties (eg.  checksum, compression), it only affects
newly-written data.  As such, it is recommended that the 'copies'
property be set at filesystem-creation time
(eg. 'zfs create -o copies=2 pool/fs').
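
For illustration, usage might look something like this (dataset names are
made up, and this assumes the property is implemented as proposed):

# zfs create -o copies=2 tank/important
# zfs get copies tank/important
NAME            PROPERTY  VALUE  SOURCE
tank/important  copies    2      local
# zfs set copies=3 tank/important

The 'zfs set' in the last line would, as noted above, only affect data
written after the change.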

The pool must be at least on-disk version 2 to use this feature (see
'zfs upgrade').

By default (copies=1), only two copies of most filesystem metadata are
stored.  However, if we are storing multiple copies of user data, then 3
copies (the maximum) of filesystem metadata will be stored.

This feature is similar to using mirroring, but differs in several
important ways:

* Different filesystems in the same pool can have different numbers of
   copies.
* The storage configuration is not constrained as it is with mirroring
   (eg. you can have multiple copies even on a single disk).
* Mirroring offers slightly better performance, because only one DVA
   needs to be allocated.
* Mirroring offers slightly better redundancy, because one disk from
   each mirror can fail without data loss.

It is important to note that the copies provided by this feature are in
addition to any redundancy provided by the pool configuration or the
underlying storage.  For example:

* In a pool with 2-way mirrors, a filesystem with copies=1 (the default)
   will be stored with 2 * 1 = 2 copies.  The filesystem can tolerate any
   1 disk failing without data loss.
* In a pool with 2-way mirrors, a filesystem with copies=3
   will be stored with 2 * 3 = 6 copies.  The filesystem can tolerate any
   5 disks failing without data loss (assuming that there are at least
   ncopies=3 mirror groups).
* In a pool with single-parity raid-z a filesystem with copies=2
   will be stored with 2 copies, each copy protected by its own parity
   block.  The filesystem can tolerate any 3 disks failing without data
   loss (assuming that there are at least ncopies=2 raid-z groups).


C. MANPAGE CHANGES
*** zfs.man4Tue Jun 13 10:15:38 2006
--- zfs.man5Mon Sep 11 16:34:37 2006
***************
*** 708,714 ****
--- 708,725 ----
they are inherited.


+  copies=1 | 2 | 3

+Controls the number of copies of data stored for this dataset.
+These copies are in addition to any redundancy provided by the
+pool (eg. mirroring or raid-z).  The copies will be stored on
+different disks if possible.
+
+Changing this property only affects newly-written data.
+Therefore, it is recommended that this property be set at
+filesystem creation time, using the '-o copies=' option.
+
+
 Temporary Mountpoint Properties
When a file system is mounted, either through mount(1M)  for
legacy  mounts  or  the  zfs mount command for normal file


D. REFERENCES
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


--
--
Jeff VICTOR  Sun Microsystemsjeff.victor @ sun.com
OS AmbassadorSr. Technical Specialist
Solaris 10 Zones FAQ:http://www.opensolaris.org/os/community/zones/faq
--
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: ZFS + rsync, backup on steroids.

2006-09-12 Thread Nicolas Williams
On Tue, Sep 12, 2006 at 05:57:33PM +1000, Boyd Adamson wrote:
 On 12/09/2006, at 1:28 AM, Nicolas Williams wrote:
 Now you have a persistent SSH connection to remote-host that forwards
 connections to localhost:12345 to port 56789 on remote-host.
 
 So now you can use your Perl scripts more securely.
 
 It would be *so* nice if we could get some of the OpenSSH behaviour  
 in this area. Recent versions include the ability to open a  
 persistent connection and then automatically re-use it for subsequent  
 connections to the same host/user.

There's an RFE for this.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Proposal: multiple copies of user data

2006-09-12 Thread Anton B. Rang
The biggest problem I see with this is one of observability, if not all 
of the data is encrypted yet what should the encryption property say ? 
If it says encryption is on then the admin might think the data is 
safe, but if it says it is off that isn't the truth either because 
some of it may be encrypted.

From a user interface perspective, I'd expect something like

  Encryption: Being enabled, 75% complete
or
  Encryption: Being disabled, 25% complete, about 2h23m remaining

I'm not sure how you'd map this into a property (or several), but it seems like 
on/off ought to be paired with transitioning to on/transitioning to off 
for any changes which aren't instantaneous.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Proposal: multiple copies of user data

2006-09-12 Thread Darren J Moffat

Anton B. Rang wrote:
The biggest problem I see with this is one of observability, if not all 
of the data is encrypted yet what should the encryption property say ? 
If it says encryption is on then the admin might think the data is 
safe, but if it says it is off that isn't the truth either because 
some of it maybe still encrypted.



From a user interface perspective, I'd expect something like


  Encryption: Being enabled, 75% complete
or
  Encryption: Being disabled, 25% complete, about 2h23m remaining


and if we are still writing to the file systems at that time ?

Maybe this really does need to be done with the file system locked.


I'm not sure how you'd map this into a property (or several), but it seems like on/off ought to 
be paired with transitioning to on/transitioning to off for any changes which aren't 
instantaneous.


Agreed, and checksum and compression would have the same issue if there 
was a mechanism to rewrite with the new checksums or compression settings.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Proposal: multiple copies of user data

2006-09-12 Thread Anton B. Rang
True - I'm a laptop user myself. But as I said, I'd assume the whole disk
would fail (it does in my experience).

That's usually the case, but single-block failures can occur as well. They're 
rare (check the uncorrectable bit error rate specifications) but if they 
happen to hit a critical file, they're painful.

On the other hand, multiple copies seems (to me) like a really expensive way to 
deal with this. ZFS is already using relatively large blocks, so it could add 
an erasure code on top of them and have far less storage overhead. If the 
assumed problem is multi-block failures in one area of the disk, I'd wonder how 
common this failure mode is; in my experience, multi-block failures are 
generally due to the head having touched the platter, in which case the whole 
drive will shortly fail. (In any case, multi-block failures could be addressed 
by spreading the data from a large block and using an erasure code.)
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Anton B. Rang
And if we are still writing to the file systems at that time ?

New writes should be done according to the new state (if encryption is being 
enabled, all new writes are encrypted), since the goal is that eventually the 
whole disk will be in the new state.

The completion percentage should probably reflect the existing data at the time 
that the state change is initiated, since new writes won't affect how much data 
has to be replaced.

Maybe this really does need to be done with the file system locked.

I don't see any technical reason to require that, and users expect better from 
us these days.  :-)

As you point out, checksum & compression will have the same issue once we have
on-line changes for those as well. The framework ought to take care of this.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread David Dyer-Bennet

On 9/11/06, Matthew Ahrens [EMAIL PROTECTED] wrote:

Here is a proposal for a new 'copies' property which would allow
different levels of replication for different filesystems.

Your comments are appreciated!


I've read the proposal, and followed the discussion so far.  I have to
say that I don't see any particular need for this feature.

Possibly there is a need for a different feature, in which the entire
control of redundancy is moved away from the pool level and to the
file or filesystem level.  I definitely see the attraction of being
able to specify by file and directory different degrees of reliability
needed.  However, the details of the feature actually proposed don't
seem to satisfy the need for extra reliability at the level that
drives people to employ redundancy; it doesn't provide a guarantee.

I see no need for additional non-guaranteed reliability on top of the
levels of guarantee provided by use of redundancy at the pool level.

Furthermore, as others have pointed out, this feature would add a high
degree of user-visible complexity.


From what I've seen here so far, I think this is a bad idea and should
not be added.
--
David Dyer-Bennet, mailto:[EMAIL PROTECTED], http://www.dd-b.net/dd-b/
RKBA: http://www.dd-b.net/carry/
Pics: http://www.dd-b.net/dd-b/SnapshotAlbum/
Dragaera/Steven Brust: http://dragaera.info/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] System hang caused by a bad snapshot

2006-09-12 Thread Ben Miller
I had a strange ZFS problem this morning.  The entire system would hang when 
mounting the ZFS filesystems.  After trial and error I determined that the 
problem was with one of the 2500 ZFS filesystems.  When mounting that user's
home the system would hang and need to be rebooted.  After I removed the
snapshots (9 of them) for that filesystem everything was fine.

I don't know how to reproduce this and didn't get a crash dump.  I don't 
remember seeing anything about this before so I wanted to report it and see if 
anyone has any ideas.

The system is a Sun Fire 280R with 3GB of RAM running SXCR b40.
The pool looks like this (I'm running a scrub currently):
# zpool status pool1
  pool: pool1
 state: ONLINE
 scrub: scrub in progress, 78.61% done, 0h18m to go
config:

NAME STATE READ WRITE CKSUM
pool1ONLINE   0 0 0
  raidz  ONLINE   0 0 0
c1t8d0   ONLINE   0 0 0
c1t9d0   ONLINE   0 0 0
c1t10d0  ONLINE   0 0 0
c1t11d0  ONLINE   0 0 0

errors: No known data errors

Ben
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread Neil A. Wilson

Darren J Moffat wrote:
While encryption of existing data is not in scope for the first ZFS 
crypto phase I am being careful in the design to ensure that it can be 
done later if such a ZFS framework becomes available.


The biggest problem I see with this is one of observability, if not all 
of the data is encrypted yet what should the encryption property say ? 
If it says encryption is on then the admin might think the data is 
safe, but if it says it is off that isn't the truth either because 
some of it may be encrypted.


I would also think that there's a significant problem around what to do 
about the previously unencrypted data.  I assume that when performing a 
scrub to encrypt the data, the encrypted data will not be written on 
the same blocks previously used to hold the unencrypted data.  As such, 
there's a very good chance that the unencrypted data would still be 
there for quite some time.  You may not be able to access it through the 
filesystem, but someone with access to the raw disks may be able to 
recover at least parts of it.  In this case, the scrub would not only 
have to write the encrypted data but also overwrite the unencrypted data 
(multiple times?).




Neil
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread Darren J Moffat

Neil A. Wilson wrote:

Darren J Moffat wrote:
While encryption of existing data is not in scope for the first ZFS 
crypto phase I am being careful in the design to ensure that it can be 
done later if such a ZFS framework becomes available.


The biggest problem I see with this is one of observability, if not 
all of the data is encrypted yet what should the encryption property 
say ? If it says encryption is on then the admin might think the data 
is safe, but if it says it is off that isn't the truth either 
because some of it may be encrypted.


I would also think that there's a significant problem around what to do 
about the previously unencrypted data.  I assume that when performing a 
scrub to encrypt the data, the encrypted data will not be written on 
the same blocks previously used to hold the unencrypted data.  As such, 
there's a very good chance that the unencrypted data would still be 
there for quite some time.  You may not be able to access it through the 
filesystem, but someone with access to the raw disks may be able to 
recover at least parts of it.  In this case, the scrub would not only 
have to write the encrypted data but also overwrite the unencrypted data 
(multiple times?).


Right, that is a very important issue.  Would a ZFS scrub framework do 
copy on write ?  As you point out if it doesn't then we still need to do 
something about the old clear text blocks because strings(1) over the 
raw disk will show them.


I see the desire to have a knob that says "make this encrypted now" but
I personally believe that it is actually better if you can make this
choice at the time you create the ZFS data set.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS API (again!), need quotactl(7I)

2006-09-12 Thread Eric Schrock
On Tue, Sep 12, 2006 at 07:23:00AM -0400, Jeff A. Earickson wrote:
 
 Modify the dovecot IMAP server so that it can get zfs quota information
 to be able to implement the QUOTA feature of the IMAP protocol (RFC 2087).
 In this case, pull the zfs quota numbers for the quota'd home directory/zfs
 filesystem.  Just like what quotactl() would do with UFS.
 
 I am really surprised that there is no zfslib API to query/set zfs
 filesystem properties.  Doing a fork/exec just to execute a zfs get
 or zfs set is expensive and inelegant.

The libzfs API will be made public at some point.  However, we need to
finish implementing the bulk of our planned features before we can feel
comfortable with the interfaces.  It will take a non-trivial amount of
work to clean up all the interfaces as well as document them.  It will
be done eventually, but I wouldn't expect it any time soon - there are
simply too many important things to get done first.

If you don't care about unstable interfaces, you're welcome to use them
as-is.  If you want a stable interface, you are correct that the only
way is through invoking 'zfs get' and 'zfs set'.
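
For example, a script (or a program calling popen(3C)) can read a single
property with the scriptable options of 'zfs get'; the dataset name below
is just a placeholder:

% zfs get -H -o value quota pool/home/jeff
10G
% zfs get -H -o value used pool/home/jeff
512M

-H suppresses the header and -o value prints only the value column, which
keeps parsing trivial, although it still costs a fork/exec per call.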

- Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread Nicolas Williams
On Tue, Sep 12, 2006 at 10:36:30AM +0100, Darren J Moffat wrote:
 Mike Gerdts wrote:
 Is there anything in the works to compress (or encrypt) existing data
 after the fact?  For example, a special option to scrub that causes
 the data to be re-written with the new properties could potentially do
 this.  If so, this feature should subscribe to any generic framework
 provided by such an effort.
 
 While encryption of existing data is not in scope for the first ZFS 
 crypto phase I am being careful in the design to ensure that it can be 
 done later if such a ZFS framework becomes available.
 
 The biggest problem I see with this is one of observability, if not all 
 of the data is encrypted yet what should the encryption property say ? 
 If it says encryption is on then the admin might think the data is 
 safe, but if it says it is off that isn't the truth either because 
 some of it may be encrypted.

I agree -- there needs to be a filesystem re-write option, something
like a scrub but at the filesystem level.  Things that might be
accomplished through it:

 - record size changes
 - compression toggling / compression algorithm changes
 - encryption/re-keying/alg. changes
 - checksum alg. changes
 - ditto blocking

What else?

To me it's important that such scrubs not happen simply as a result of
setting/changing a filesystem property, but it's also important that the
user/admin be told that changing the property requires scrubbing in
order to take effect for data/meta-data written before the change.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] sys_mount problem

2006-09-12 Thread Vladimir Kotal

Hello,

I'm trying to set ZFS to work with RBAC so that I could manage all ZFS
stuff w/out root. However, in my setup there is sys_mount privilege
needed:

- without sys_mount:

vk199839:tessier:~$ zpool list
NAME    SIZE   USED    AVAIL   CAP   HEALTH   ALTROOT
local   264G   71.4G   193G    27%   ONLINE   -
vk199839:tessier:~$ profiles 
ZFS Storage Management
ZFS File system Management
Basic Solaris User
All
vk199839:tessier:~$ ppriv $$
317:bash
flags = none
E: basic,dtrace_kernel,dtrace_proc,dtrace_user
I: basic,dtrace_kernel,dtrace_proc,dtrace_user
P: basic,dtrace_kernel,dtrace_proc,dtrace_user
L: all
vk199839:tessier:~$ pfexec zfs create local/testfs
cannot create 'local/testfs': permission denied
vk199839:tessier:~$ pfexec truss zfs create local/testfs

snip

zone_lookup(0x) = 0
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x0804679C)  Err#2 ENOENT
ioctl(4, ZFS_IOC_CREATE, 0x0804679C)Err#1 EPERM [sys_mount]
brk(0x080CA000) = 0
fstat64(2, 0x080457C0)  = 0
cannot create 'write(2,  c a n n o t   c r e a t.., 15)   = 15
local/testfswrite(2,  l o c a l / t e s t f s, 12)= 12
': permission deniedwrite(2,  ' :   p e r m i s s i o.., 20)  = 20


- however with sys_mount:

vk199839:tessier:~$ ppriv $$
434:/usr/bin/bash
flags = none
E: basic,dtrace_kernel,dtrace_proc,dtrace_user,sys_mount
I: basic,dtrace_kernel,dtrace_proc,dtrace_user,sys_mount
P: basic,dtrace_kernel,dtrace_proc,dtrace_user,sys_mount
L: all
vk199839:tessier:~$ profiles 
ZFS Storage Management
ZFS File system Management
Basic Solaris User
All
vk199839:tessier:~$ pfexec zfs create local/testfs
vk199839:tessier:~$ echo $?
0
vk199839:tessier:~$ zfs list |grep testfs
local/testfs    9K   191G   9K  /local/testfs
vk199839:tessier:~$ ls -ald /local/testfs/
drwxr-xr-x   2 root sys2 Sep 12 19:15 /local/testfs/
vk199839:tessier:~$ ls -ald /local/   
drwxrwxr-x  14 vk199839 sys   16 Sep 12 19:15 /local/

Any idea what is wrong ?

Also, I would like the fs to be created with vk199839:sys and not with
root:sys ownership.


v.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and free space

2006-09-12 Thread Matthew Ahrens

Robert Milkowski wrote:

Hello Mark,

Monday, September 11, 2006, 4:25:40 PM, you wrote:

MM Jeremy Teo wrote:

Hello,

how are writes distributed as the free space within a pool reaches a
very small percentage?

I understand that when free space is available, ZFS will batch writes
and then issue them in sequential order, maximising write bandwidth.
When free space reaches a minimum, what happens?

Thanks! :)


MM Just what you would expect to happen:

MM As contiguous write space becomes unavailable, writes will be come
MM scattered and performance will degrade.  More importantly: at this
MM point ZFS will begin to heavily write-throttle applications in order
MM to ensure that there is sufficient space on disk for the writes to
MM complete.  This means that there will be less writes to batch up
MM in each transaction group for contiguous IO anyway.

MM As with any file system, performance will tend to degrade at the
MM limits.  ZFS keeps a small overhead reserve (much like other file
MM systems) to help mitigate this, but you will definitely see an
MM impact.

I hope it won't be a problem if space is getting low in a file system
with a quota set, while the pool the file system is in still has plenty
of space, right?


If you are running close to your quota, there will be a little bit of 
performance degradation, but not to the same degree as when running low 
on free space in the pool.  The reason performance degrades when you're 
near your quota is that we aren't exactly sure how much space will be 
used until we actually get around to writing it out (due to compression, 
snapshots, etc).  So we have to write things out in smaller batches (ie. 
flush out transaction groups more frequently than is optimal).


--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Recommendation ZFS on StorEdge 3320

2006-09-12 Thread UNIX admin
 This is simply not true. ZFS would protect against
 the same type of 
 errors seen on an individual drive as it would on a
 pool made of HW raid 
 LUN(s). It might be overkill to layer ZFS on top of a
 LUN that is 
 already protected in some way by the devices internal
 RAID code but it 
 does not make your data susceptible to HW errors
 caused by the storage 
 subsystem's RAID algorithm, and slow down the I/O.

I disagree, and vehemently at that. I maintain that if the HW RAID is used, the 
chance of data corruption is much higher, and ZFS would have a lot more 
repairing to do than it would if it were used directly on disks. Problems with 
HW RAID algorithms have been plaguing us for at least 15 years or more. The 
venerable Sun StorEdge T3 comes to mind!

Further, while it is perfectly logical to me that doing RAID calculations twice 
is slower than doing it once, you maintain that is not the case, perhaps 
because one calculation is implemented in FW/HW?

Well, why don't you simply try it out? Once with both RAID HW and ZFS, and once 
with just ZFS directly on the disks?
RAID HW is very likely to have a slower CPU or CPUs than any modern system that 
ZFS will be running on. Even if we assume that the HW RAID's CPU is the same 
speed or faster than the CPU in the server, you still have TWICE the amount of 
work that has to be performed for every write. Once by the hardware and once by 
the software (ZFS). Caches might help some, but I fail to see how double the 
amount of work (and hidden, abstracted complexity) would be as fast or faster 
than just using ZFS directly on the disks.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Recommendation ZFS on StorEdge 3320

2006-09-12 Thread UNIX admin
 There are also the speed enhancement provided by a HW
 raid array, and 
 usually RAS too,  compared to a native disk drive but
 the numbers on 
 that are still coming in and being analyzed. (See
 previous threads.)

Speed enhancements? What is the baseline of comparison?

Hardware RAIDs can be boiled down to two features: a cache, which does data
reordering for optimal disk writes, and parity calculation, which is
offloaded from the server's CPU.

But HW calculations still take time, and the in-between, battery backed cache 
serves to replace the individual disk caches, because of the traditional file 
system approach which had to have some assurance that the data made it to disk 
in one way or another.

With ZFS however the in-between cache is obsolete, as individual disk caches 
can be used directly. I also openly question whether even the dedicated RAID HW 
is faster than the newest CPUs in modern servers.

Unless there is something that I'm missing, I fail to see the benefit of a HW 
RAID in tandem with ZFS. In my view, this holds especially true when one gets 
into SAN storage like SE6920, EMC and Hitachi products.

Furthermore, need I remind you of the buggy SE6920 firmware? I don't trust it as
far as I can throw it.

Or, lets put it this way: I trust Mr. Bonwick a whole lot more than some 
firmware writers.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Recommendation ZFS on StorEdge 3320

2006-09-12 Thread Frank Cusack
On September 12, 2006 11:35:54 AM -0700 UNIX admin [EMAIL PROTECTED] 
wrote:

There are also the speed enhancement provided by a HW
raid array, and
usually RAS too,  compared to a native disk drive but
the numbers on
that are still coming in and being analyzed. (See
previous threads.)


It would be nice if you would attribute your quotes.  Maybe this is a
limitation of the web interface?


Speed enhancements? What is the baseline of comparison?

Hardware RAIDs can be banalized to two features: cache which does data
reordering for optimal disk writes and parity calculation which is being
offloaded off of the server's CPU.

But HW calculations still take time, and the in-between, battery backed
cache serves to replace the individual disk caches, because of the
traditional file system approach which had to have some assurance that
the data made it to disk in one way or another.

With ZFS however the in-between cache is obsolete, as individual disk
caches can be used directly. I also openly question whether even the
dedicated RAID HW is faster than the newest CPUs in modern servers.

Unless there is something that I'm missing, I fail to see the benefit of
a HW RAID in tandem with ZFS. In my view, this holds especially true when
one gets into SAN storage like SE6920, EMC and Hitachi products.


I agree with your basic point, that the HW RAID cache is obsoleted by zfs
(which seems to be substantiated here by benchmark results), but I think
you slightly mischaracterize its use.  The speed of the HW RAID CPU is
irrelevant; the parity is XOR which is extremely fast with any CPU when
compared to disk write speed.

What is relevant is, as Anton points out, the CPU cache on the host system.
Parity calculations kill the cache and will hurt memory-intensive apps.
So in this case, offloading it may help in the ufs case.  (Not for zfs,
as I understand from reading here, since checksums still have to be done.
I would argue that this is *absolutely essential* [and zfs obsoletes all
other filesystems] and therefore the gain in the ufs on HW RAID-5 case is
worthless due to the correctness tradeoff.)

It would be interesting to have a zfs enabled HBA to offload the checksum
and parity calculations.  How much of zfs would such an HBA have to
understand?

-frank

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] sys_mount problem

2006-09-12 Thread Mark Shellenbaum

Vladimir Kotal wrote:

Hello,

I'm trying to set ZFS to work with RBAC so that I could manage all ZFS
stuff w/out root. However, in my setup there is sys_mount privilege
needed:

- without sys_mount:



Currently, anything in zfs that changes dataset configurations, such as
file systems and properties, requires the sys_mount privilege.  This actually
comes from the secpolicy_zfs() function, if you're curious.
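
A rough sketch of one way to hand that privilege to a user by default
(the user name is taken from the transcript above; note that defaultpriv
replaces the whole set, so any other privileges you rely on, such as the
dtrace ones shown earlier, must be listed too):

# usermod -K defaultpriv=basic,dtrace_kernel,dtrace_proc,dtrace_user,sys_mount vk199839

The user has to log in again for the new default privilege set to take
effect.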



ioctl(4, ZFS_IOC_CREATE, 0x0804679C)Err#1 EPERM [sys_mount]
brk(0x080CA000) = 0
fstat64(2, 0x080457C0)  = 0
cannot create 'write(2,  c a n n o t   c r e a t.., 15)   = 15
local/testfswrite(2,  l o c a l / t e s t f s, 12)= 12
': permission deniedwrite(2,  ' :   p e r m i s s i o.., 20)  = 20


- however with sys_mount:

vk199839:tessier:~$ ppriv $$
434:/usr/bin/bash
flags = none
E: basic,dtrace_kernel,dtrace_proc,dtrace_user,sys_mount
I: basic,dtrace_kernel,dtrace_proc,dtrace_user,sys_mount
P: basic,dtrace_kernel,dtrace_proc,dtrace_user,sys_mount
L: all
vk199839:tessier:~$ profiles 
ZFS Storage Management

ZFS File system Management
Basic Solaris User
All
vk199839:tessier:~$ pfexec zfs create local/testfs
vk199839:tessier:~$ echo $?
0
vk199839:tessier:~$ zfs list |grep testfs
local/testfs    9K   191G   9K  /local/testfs
vk199839:tessier:~$ ls -ald /local/testfs/
drwxr-xr-x   2 root sys2 Sep 12 19:15 /local/testfs/
vk199839:tessier:~$ ls -ald /local/   
drwxrwxr-x  14 vk199839 sys   16 Sep 12 19:15 /local/


Any idea what is wrong ?

Also, I would like the fs to be created with vk199839:sys and not with
root:sys ownership.


That will be changed once the delegated administration model is 
integrated.  Once it is integrated a file systems root node will be 
created with the uid/gid of the user that creates the file system.


For more information on this check out the following thread

http://www.opensolaris.org/jive/thread.jspa?threadID=11130&tstart=15



  -Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Proposal: multiple copies of user data

2006-09-12 Thread Celso
Take this for what it is: the opinion of someone who knows less about zfs than
probably anyone else on this thread, but...

I would like to add my support for this proposal.

As I understand it, the reason for using ditto blocks on metadata is that
maintaining their integrity is vital for the health of the filesystem, even if
the zpool isn't mirrored or redundant in any way (i.e. laptops, or people who
just don't or can't add another drive).

One of the great things about zfs, is that it protects not just against 
mechanical failure, but against silent data corruption. Having this available 
to laptop owners seems to me to be important to making zfs even more attractive.

Granted, if you are running an enterprise-based fileserver, this probably isn't
going to be your first choice for data protection. You will probably be using
the other features of zfs like mirroring, raidz, raidz2, etc.

Am I correct in assuming that having, say, 2 copies of your documents
filesystem means that should silent data corruption occur, your data can be
reconstructed? So you can leave your OS and base applications with 1 copy,
but your important data can be protected.

In a way, this reminds me of intel's matrix raid but much cooler (it doesn't 
rely on a specific motherboard for one thing).

I would also agree that utilities like 'ls' and quotas should report both
copies and count them against people's quotas. It just doesn't seem too hard to
understand that because you have 2 copies, you halve the amount of available
space.

Just to reiterate, I think this would be an awesome feature!

Celso.

PS. Please feel free to correct me on any technical inaccuracies. I am trying 
to learn about zfs and Solaris 10 in general.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Proposal: multiple copies of user data

2006-09-12 Thread Dick Davies

On 12/09/06, Celso [EMAIL PROTECTED] wrote:


One of the great things about zfs, is that it protects not just against 
mechanical failure, but against silent data corruption. Having this available 
to laptop owners seems to me to be important to making zfs even more attractive.


I'm not arguing against that. I was just saying that *if* this was useful to you
(and you were happy with the dubious resilience/performance benefits) you can
already create mirrors/raidz on a single disk by using partitions as
building blocks.
There's no need to implement the proposal to gain that.



Am I correct in assuming that having say 2 copies of your documents 
filesystem means should silent data corruption occur, your data can be reconstructed. So 
that you can leave your os and base applications with 1 copy, but your important data can 
be protected.


Yes.

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Recommendation ZFS on StorEdge 3320

2006-09-12 Thread Torrey McMahon

UNIX admin wrote:

This is simply not true. ZFS would protect against
the same type of 
errors seen on an individual drive as it would on a
pool made of HW raid 
LUN(s). It might be overkill to layer ZFS on top of a
LUN that is 
already protected in some way by the devices internal
RAID code but it 
does not make your data susceptible to HW errors
caused by the storage 
subsystem's RAID algorithm, and slow down the I/O.



I disagree, and vehemently at that. I maintain that if the HW RAID is used, the 
chance of data corruption is much higher, and ZFS would have a lot more 
repairing to do than it would if it were used directly on disks. Problems with 
HW RAID algorithms have been plaguing us for at least 15 years or more. The 
venerable Sun StorEdge T3 comes to mind!
  



Please expand on your logic. Remember that ZFS works on top of LUNs. A 
disk drive by itself is a LUN when added to a ZFS pool. A LUN can also 
be comprised of multiple disk drives striped together and presented to a 
host as one logical unit. Or a LUN can be offered by a virtualization 
gateway that in turn imports raid array LUNs that are really made up of 
individual disk drives. Or ... insert a million different ways to get a 
host something called a LUN that allows the host to read and write 
blocks. They could be really slow LUNs because they're two hamsters 
shuffling zeros and ones back and forth on little wheels. (OK, that 
might be too slow.) Outside of the write-cache enabling that happens when
entire disk drives are presented to the pool, ZFS doesn't care what the LUN is
made of.


ZFS reliability features are available and work on top of the LUNs you 
give it and the configuration you use. The type of LUN is 
inconsequential at the ZFS level. If I had 12 LUNs that were single disk
drives and created a RAIDZ pool it would have the same reliability at
the ZFS level as if I presented it 12 LUNs that were really quad-mirrors
from 12 independent HW RAID arrays. You can make the argument that the 12
disk drive config is easier to use or that the overall reliability of
the 12 quad-mirror LUN system is higher, but from ZFS's
point of view it's the same. It's happily writing blocks, checking
checksums, reading things from the LUNs, etc. etc. etc.


On top of that disk drives are not some simple beast that just coughs up 
i/o when you want it to. A modern disk drive does all sorts of stuff 
under the covers to speed up i/o and - surprise - increase the 
reliability of the drive as much as possible. If you think you're really 
writing straight to disk you're not. Cache, ZBR, bad block 
re-allocation, all come into play.


As for problems with specific raid arrays, including the T3, you are 
preaching to the choir but I'm definitely not going to get into a 
pissing contest over specific components having more or fewer bugs than
another.



Further, while it is perfectly logical to me that doing RAID calculations twice 
is slower than doing it once, you maintain that is not the case, perhaps 
because one calculation is implemented in FW/HW?
  


As the man says, "it depends". A really fast raid array might be
responding to i/o requests faster than a single disk drive. It might not
given the nature of the i/o coming in.


Don't think of it in terms of RAID calculations taking a certain amount
of time. Think of it in terms of having to meet a specific set of
requirements to manage your data. I'll be the first to say that if
you're going to be putting ZFS on a desktop then a simple JBOD is a box
to look at. If you're going to look at an enterprise data center the
answer is going to be different. That is something a lot of people on
this alias seem to be missing out on. Stating "ZFS on JBODs is the answer
to everything" is the punchline of the "When all you have is a hammer..."
routine.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Celso
 On 12/09/06, Celso [EMAIL PROTECTED] wrote:
 
  One of the great things about zfs, is that it
 protects not just against mechanical failure, but
 against silent data corruption. Having this available
 to laptop owners seems to me to be important to
 making zfs even more attractive.
 
 I'm not arguing against that. I was just saying that
 *if* this was useful to you
 (and you were happy with the dubious
 resilience/performance benefits) you can
 already create mirrors/raidz on a single disk by
 using partitions as
 building blocks.
 There's no need to implement the proposal to gain
 that.
 
 


It's not as granular though is it?

In the situation you  describe:

...you split one disk in two. You then have effectively two partitions from
which you can then create a new mirrored zpool. Then everything is mirrored.
Correct?

With ditto blocks, you can selectively add copies (seeing as how filesystems are
so easy to create on zfs). If you are only concerned with copies of your
important documents and email, why should /usr/bin be mirrored?

That's my opinion anyway. I always enjoy choice, and I really believe this is a 
useful and flexible one.

Celso
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Marvell cards.. as recommended

2006-09-12 Thread Joe Little

So, people here recommended the Marvell cards, and one even provided a
link to acquire them for SATA jbod support. Well, this is what the
latest bits (B47) say:

Sep 12 13:51:54 vram marvell88sx: [ID 679681 kern.warning] WARNING:
marvell88sx0: Could not attach, unsupported chip stepping or unable to
get the chip stepping
Sep 12 13:51:54 vram marvell88sx: [ID 679681 kern.warning] WARNING:
marvell88sx1: Could not attach, unsupported chip stepping or unable to
get the chip stepping
Sep 12 13:51:54 vram marvell88sx: [ID 679681 kern.warning] WARNING:
marvell88sx0: Could not attach, unsupported chip stepping or unable to
get the chip stepping
Sep 12 13:51:54 vram marvell88sx: [ID 679681 kern.warning] WARNING:
marvell88sx1: Could not attach, unsupported chip stepping or unable to
get the chip stepping

Any takers on how to get around this one?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Memory Usage

2006-09-12 Thread Mark Maybee

Thomas Burns wrote:

Hi,

We have been using zfs for a couple of months now, and, overall, really
like it.  However, we have run into a major problem -- zfs's memory
requirements crowd out our primary application.  Ultimately, we have to reboot
the machine so there is enough free memory to start the application.

What I would like is:

1) A way to limit the size of the cache (a gig or two would be fine  for 
us)


2) A way to clear the caches -- hopefully, something faster than  rebooting
the machine.

Is there any way I can do either of these things?

Thanks,
Tom Burns


Tom,

What version of solaris are you running?  In theory, ZFS should not
be hogging your system memory to the point that it crowds out your
primary applications... but this is still an area that we are working
out the kinks in.  If you could provide a core dump of the machine
when it gets to the point that you can't start your app, it would
help us.

As to your questions; I will give you some ways to do these things,
but these are not considered best practice:

1) You should be able to limit your cache max size by setting
arc.c_max.  It's currently initialized to be phys-mem-size - 1GB.

2) First try unmount/remounting your file system to clear the
cache.  If that doesn't work, try exporting/importing your pool.
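
For completeness, a sketch of what both of those look like in practice -- with 
the same "not best practice" caveat; arc is a private kernel structure, so the 
member names and layout can change between builds, and the pool name below is 
made up:

  # Read the current limit and the address of the field (read-only session):
  echo 'arc::print -a c_max' | mdb -k

  # To lower it (e.g. to 2 GB), open a writable session with mdb -kw and
  # store the new value at the address printed above:
  #   <addr>/Z 0x80000000        # <addr> is system-specific, not a literal

  # Clearing cached data per filesystem, or for the whole pool:
  zfs unmount tank/fs
  zfs mount tank/fs
  zpool export tank
  zpool import tank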

-Mark
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Memory Usage

2006-09-12 Thread Al Hopper
On Tue, 12 Sep 2006, Mark Maybee wrote:

 Thomas Burns wrote:
  Hi,
 
  We have been using zfs for a couple of months now, and, overall, really
  like it.  However, we have run into a major problem -- zfs's memory
  requirements
  crowd out our primary application.  Ultimately, we have to reboot the
  machine
  so there is enough free memory to start the application.
 
  What I would like is:
 
  1) A way to limit the size of the cache (a gig or two would be fine  for
  us)
 
  2) A way to clear the caches -- hopefully, something faster than  rebooting
  the machine.
 
  Is there any way I can do either of these things?
 
  Thanks,
  Tom Burns

 Tom,

 What version of solaris are you running?  In theory, ZFS should not
 be hogging your system memory to the point that it crowds out your
 primary applications... but this is still an area that we are working
 out the kinks in.  If you could provide a core dump of the machine
 when it gets to the point that you can't start your app, it would
 help us.

 As to your questions; I will give you some ways to do these things,
 but these are not considered best practice:

 1) You should be able to limit your cache max size by setting
 arc.c_max.  Its currently initialized to be phys-mem-size - 1GB.

 2) First try unmount/remounting your file system to clear the
 cache.  If that doesn't work, try exporting/importing your pool.

Another nasty and risky workaround is to start making copies of a large
file in /tmp while watching your available swap space carefully.  When you
hit the low memory water mark, ZFS will free up a snitload (technical term
(TM)) of memory.  Then immediately rm all the files you created in /tmp.
You don't want to completely exhaust memory or you'll probably lose the
system.
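
In other words, something like the following -- shown purely to illustrate the 
sequence Al describes, with made-up sizes; keep an eye on swap -s and stop well 
before it runs out:

  swap -s                          # note the available figure first
  mkfile 1024m /tmp/ballast.1      # /tmp is swap-backed, so this eats memory
  swap -s                          # re-check; repeat cautiously if needed
  mkfile 1024m /tmp/ballast.2
  rm /tmp/ballast.*                # remove the ballast immediately afterwards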

Remember my first line: nasty and risky.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
   Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] marvel cards.. as recommended

2006-09-12 Thread James C. McPherson

Joe Little wrote:

So, people here recommended the Marvell cards, and one even provided a
link to acquire them for SATA jbod support. Well, this is what the
latest bits (B47) say:

Sep 12 13:51:54 vram marvell88sx: [ID 679681 kern.warning] WARNING:
marvell88sx0: Could not attach, unsupported chip stepping or unable to
get the chip stepping
Sep 12 13:51:54 vram marvell88sx: [ID 679681 kern.warning] WARNING:
marvell88sx1: Could not attach, unsupported chip stepping or unable to
get the chip stepping
Sep 12 13:51:54 vram marvell88sx: [ID 679681 kern.warning] WARNING:
marvell88sx0: Could not attach, unsupported chip stepping or unable to
get the chip stepping
Sep 12 13:51:54 vram marvell88sx: [ID 679681 kern.warning] WARNING:
marvell88sx1: Could not attach, unsupported chip stepping or unable to
get the chip stepping

Any takers on how to get around this one?


You could start by providing the output from prtpicl -v and
prtconf -v as well as /usr/X11/bin/scanpci -v -V 1 so we
know which device you're actually having a problem with.

Is the pci vendor+deviceid for that card listed in your
/etc/driver_aliases file against the marvell88sx driver?
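
i.e. roughly the following ("pciVVVV,DDDD" is a placeholder -- substitute the 
vendor/device pair scanpci reports for your card):

  /usr/X11/bin/scanpci -v            # note the Marvell controller's IDs
  grep marvell88sx /etc/driver_aliases
  # If your card's pair isn't listed, it can be added by hand:
  update_drv -a -i '"pciVVVV,DDDD"' marvell88sx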


James

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Memory Usage

2006-09-12 Thread Thomas Burns


On Sep 12, 2006, at 2:04 PM, Mark Maybee wrote:


Thomas Burns wrote:

Hi,
We have been using zfs for a couple of months now, and, overall,  
really
like it.  However, we have run into a major problem -- zfs's  
memory  requirements
crowd out our primary application.  Ultimately, we have to reboot  
the  machine

so there is enough free memory to start the application.
What I would like is:
1) A way to limit the size of the cache (a gig or two would be  
fine  for us)
2) A way to clear the caches -- hopefully, something faster than   
rebooting

the machine.
Is there any way I can do either of these things?
Thanks,
Tom Burns


Tom,

What version of solaris are you running?  In theory, ZFS should not
be hogging your system memory to the point that it crowds out your
primary applications... but this is still an area that we are working
out the kinks in.  If you could provide a core dump of the machine
when it gets to the point that you can't start your app, it would
help us.


We are running the jun 06 version of solaris (10/6?).  I don't have a  
core
dump now -- but can probably get one in the next week or so.  Where  
should

I send it?

Also, where do I set arc.c_max?  In /etc/system?  Out of curiosity,  
why isn't
limiting arc.c_max considered best practice (I just want to make sure  
I am
not missing something about the effect limiting it will have)?  My  
guess is
that in our case (lots of small groups -- 50 people or less --  
sharing files
over the web)  that file system caches are not that useful.  The  
small groups
mean that no one file gets used that often and, since access is over  
the web,
their response time will be largely limited by their internet  
connection.


Thanks a lot for the response!



As to your questions; I will give you some ways to do these things,
but these are not considered best practice:

1) You should be able to limit your cache max size by setting
arc.c_max.  Its currently initialized to be phys-mem-size - 1GB.

2) First try unmount/remounting your file system to clear the
cache.  If that doesn't work, try exporting/importing your pool.

-Mark


Tom Burns



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] How to NOT mount a ZFS storage pool/ZFS file system?

2006-09-12 Thread David Smith
I currently have a system which has two ZFS storage pools.  One of the pools is 
coming from a faulty piece of hardware.  I would like to bring up our server 
mounting the storage pool which is okay and NOT mounting the one with from the 
hardware with problems.   Is there a simple way to NOT mount one of my ZFS 
storage pools?

The system is currently down due to the disk issues from one of the above 
pools.  

Thanks,

David
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to NOT mount a ZFS storage pool/ZFS file system?

2006-09-12 Thread Frank Cusack

zpool export
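
i.e. roughly this (pool name made up); an exported pool is left alone at boot 
until you explicitly import it again:

  zpool export badpool     # the healthy pool stays imported and mounts normally
  # ...later, once the hardware is sorted out:
  zpool import badpool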

On September 12, 2006 2:41:27 PM -0700 David Smith [EMAIL PROTECTED] wrote:

I currently have a system which has two ZFS storage pools.  One of the pools is 
coming from a
faulty piece of hardware.  I would like to bring up our server mounting the 
storage pool which is
okay and NOT mounting the one with from the hardware with problems.   Is there 
a simple way to
NOT mount one of my ZFS storage pools?

The system is currently down due to the disk issues from one of the above pools.

Thanks,

David


This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Memory Usage

2006-09-12 Thread Mark Maybee

Thomas Burns wrote:


On Sep 12, 2006, at 2:04 PM, Mark Maybee wrote:


Thomas Burns wrote:


Hi,
We have been using zfs for a couple of months now, and, overall,  really
like it.  However, we have run into a major problem -- zfs's  memory  
requirements
crowd out our primary application.  Ultimately, we have to reboot  
the  machine

so there is enough free memory to start the application.
What I would like is:
1) A way to limit the size of the cache (a gig or two would be  fine  
for us)
2) A way to clear the caches -- hopefully, something faster than   
rebooting

the machine.
Is there any way I can do either of these things?
Thanks,
Tom Burns



Tom,

What version of solaris are you running?  In theory, ZFS should not
be hogging your system memory to the point that it crowds out your
primary applications... but this is still an area that we are working
out the kinks in.  If you could provide a core dump of the machine
when it gets to the point that you can't start your app, it would
help us.



We are running the jun 06 version of solaris (10/6?).  I don't have a  core
dump now -- but can probably get one in the next week or so.  Where  should
I send it?


You can drop cores via ftp to:

   sunsolve.sun.com
   login as anonymous or ftp
   deposit into /cores
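
For example, something like this from the box that took the dump (the file 
names are just the usual savecore output -- use whatever is actually sitting 
in /var/crash/<hostname>):

  cd /var/crash/`hostname`      # default savecore directory
  ftp sunsolve.sun.com          # log in as "ftp" or "anonymous"
  # at the ftp> prompt:
  #   cd /cores
  #   bin                       # binary mode
  #   put unix.0
  #   put vmcore.0
  #   bye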

Also, where do I set arc.c_max?  In etc/system?  Out of curiosity,  why 
isn't

limiting arc.c_max considered best practice (I just want to make sure  I am
not missing something about the effect limiting it will have)?  My  
guess is
that in our case (lots of small groups -- 50 people or less --  sharing 
files
over the web)  that file system caches are not that useful.  The  small 
groups
mean that no one file gets used that often and, since access is over  
the web,

their response time will be largely limited by their internet  connection.



We don't want users to need to tune a bunch of knobs to get performance
out of ZFS.  We want it to work well out of the box.  So we are trying
to discourage using these tunables, and instead figure out what the root
problem is and fix it.  There is really no reason why zfs shouldn't be
able to adapt itself appropriately to the available memory.


Thanks a lot for the response!



As to your questions; I will give you some ways to do these things,
but these are not considered best practice:

1) You should be able to limit your cache max size by setting
arc.c_max.  Its currently initialized to be phys-mem-size - 1GB.

2) First try unmount/remounting your file system to clear the
cache.  If that doesn't work, try exporting/importing your pool.

-Mark



Tom Burns



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Memory Usage

2006-09-12 Thread johansen
 1) You should be able to limit your cache max size by
 setting arc.c_max.  Its currently initialized to be
 phys-mem-size - 1GB.

Mark's assertion that this is not a best practice is something of an 
understatement.  ZFS was designed so that users/administrators wouldn't have to 
configure tunables to achieve optimal system performance.  ZFS performance is 
still a work in progress.

The problem with adjusting arc.c_max is that its definition may change from one 
release to another.  It's an internal kernel variable, its existence isn't 
guaranteed.  There are also no guarantees about the semantics of what a future 
arc.c_max might mean.  It's possible that future implementations may change the 
definition such that reducing c_max has other unintended consequences.

Unfortunately, at the present time this is probably the only way to limit the 
cache size.  Mark and I are working on strategies to make sure that ZFS is a 
better citizen when it comes to memory usage and performance.  Mark has 
recently made a number of changes which should help ZFS reduce its memory 
footprint.  However, until these changes and others make it into a production 
build we're going to have to live with this inadvisable approach for adjusting 
the cache size.

-j
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Dick Davies

On 12/09/06, Celso [EMAIL PROTECTED] wrote:


...you split one disk in two. you then have effectively two partitions which 
you can then create a new mirrored zpool with. Then everything is mirrored. 
Correct?


Everything in the filesystems in the pool, yes.


With ditto blocks, you can selectively add copies (seeing as how filesystem are 
so easy to create on zfs). If you are only concerned with copies of your 
important documents and email, why should /usr/bin be mirrored.


So my machine will boot if a disk fails. Which happened the other day :)

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Memory Usage

2006-09-12 Thread Thomas Burns
Also, where do I set arc.c_max?  In etc/system?  Out of  
curiosity,  why isn't
limiting arc.c_max considered best practice (I just want to make  
sure  I am
not missing something about the effect limiting it will have)?   
My  guess is
that in our case (lots of small groups -- 50 people or less --   
sharing files
over the web)  that file system caches are not that useful.  The   
small groups
mean that no one file gets used that often and, since access is  
over  the web,
their response time will be largely limited by their internet   
connection.


We don't want users to need to tune a bunch of knobs to get  
performance
out of ZFS.  We want it to work well out of the box.  So we are  
trying
to discourage using these tunables, and instead figure out what the  
root

problem is and fix it.  There is really no reason why zfs shouldn't be
able to adapt itself appropriately to the available memory.


Ah, the ZFS philosophy that I love (not having to tune a bunch of knobs)!

Seems like you need a way for the kernel to say "I would like some memory
back now."  I don't have the slightest idea how practical that is
though...


BTW -- did I guess right wrt where I need to set arc.c_max (/etc/system)?

Thanks,
Tom

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Celso
 On 12/09/06, Celso [EMAIL PROTECTED] wrote:
 
  ...you split one disk in two. you then have
 effectively two partitions which you can then create
 a new mirrored zpool with. Then everything is
 mirrored. Correct?
 
 Everything in the filesystems in the pool, yes.
 
  With ditto blocks, you can selectively add copies
 (seeing as how filesystem are so easy to create on
 zfs). If you are only concerned with copies of your
 important documents and email, why should /usr/bin be
 mirrored.
 
 So my machine will boot if a disk fails. Which
 happened the other day :)
 
 -- 
 Rasputin :: Jack of All Trades - Master of Nuns
 http://number9.hellooperator.net/
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss
 
ok cool.

I think it has already been said that in many people's experience, when a disk 
fails, it completely fails. Especially on laptops. Of course ditto blocks 
wouldn't help you in this situation either!

I still think that silent data corruption is a valid concern, one that ditto 
blocks would solve. Also, I am not thrilled about losing that much space for 
duplication of unnecessary data (caused by partitioning a disk in two).

I also echo Darren's comments on zfs performing better when it has the whole 
disk.

Hopefully we can agree that you lose nothing by adding this feature, even if 
you personally don't see a need for it.

Celso
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Torrey McMahon

Celso wrote:


Hopefully we can agree that you lose nothing by adding this feature, even if 
you personally don't see a need for it.



If I read correctly, user tools will show more space in use when adding 
copies, quotas are impacted, etc. One could argue the added confusion 
outweighs the addition of the feature.


As others have asked I'd like to see the problem that this feature is 
designed to solve.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread Matthew Ahrens

Matthew Ahrens wrote:
Here is a proposal for a new 'copies' property which would allow 
different levels of replication for different filesystems.


Thanks everyone for your input.

The problem that this feature attempts to address is when you have some 
data that is more important (and thus needs a higher level of 
redundancy) than other data.  Of course in some situations you can use 
multiple pools, but that is antithetical to ZFS's pooled storage model. 
 (You have to divide up your storage, you'll end up with stranded 
storage and bandwidth, etc.)


Given the overwhelming criticism of this feature, I'm going to shelve it 
for now.


Out of curiosity, what would you guys think about addressing this same 
problem by having the option to store some filesystems unreplicated on 
a mirrored (or raid-z) pool?  This would have the same issues of 
unexpected space usage, but since it would be *less* than expected, that 
might be more acceptable.  There are no plans to implement anything like 
this right now, but I just wanted to get a read on it.


--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Proposal: multiple copies of user data

2006-09-12 Thread Celso
 Matthew Ahrens wrote:
  Here is a proposal for a new 'copies' property
 which would allow 
  different levels of replication for different
 filesystems.
 
 Thanks everyone for your input.
 
 The problem that this feature attempts to address is
 when you have some 
 data that is more important (and thus needs a higher
 level of 
 redundancy) than other data.  Of course in some
 situations you can use 
 multiple pools, but that is antithetical to ZFS's
 pooled storage model. 
 (You have to divide up your storage, you'll end up
  with stranded 
 torage and bandwidth, etc.)
 
 Given the overwhelming criticism of this feature, I'm
 going to shelve it 
 for now.


Damn! That's a real shame! I was really starting to look forward to that. 
Please reconsider??!


 --matt
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss
 

Celso
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread David Dyer-Bennet

On 9/12/06, Matthew Ahrens [EMAIL PROTECTED] wrote:

Matthew Ahrens wrote:
 Here is a proposal for a new 'copies' property which would allow
 different levels of replication for different filesystems.

Thanks everyone for your input.

The problem that this feature attempts to address is when you have some
data that is more important (and thus needs a higher level of
redundancy) than other data.  Of course in some situations you can use
multiple pools, but that is antithetical to ZFS's pooled storage model.
  (You have to divide up your storage, you'll end up with stranded
storage and bandwidth, etc.)

Given the overwhelming criticism of this feature, I'm going to shelve it
for now.


I think it's a valid problem.  My understanding was that this didn't
give a *guaranteed* solution, though.  I think most people, when
committing to the point of replication (spending actual money), need a
guarantee at some level (not of course of total safety; but that the
data actually does exist on separate disks, and will survive the
destruction of one disk).  A good solution to this problem would be
valuable.  (And I'd accept a non-guarantee on a single disk; or rather
a guarantee that said if enough blocks to find the data exist, and a
copy of each data block exists, we can retrieve the data; but that
guarantee *does* exist I think).


Out of curiosity, what would you guys think about addressing this same
problem by having the option to store some filesystems unreplicated on
an mirrored (or raid-z) pool?  This would have the same issues of
unexpected space usage, but since it would be *less* than expected, that
might be more acceptable.  There are no plans to implement anything like
this right now, but I just wanted to get a read on it.


I was never concerned at the free space issues (though I was concerned
by some of the proposed solutions to what I saw as a non-issue).  I'd
be happy if the free space described how many bytes of default files
you could add to the pool, and the user would have to understand that
results would differ if they used non-default parameters.  You're
probably right that fewer people would mind ending up with *more* space
than a naive reading would suggest, rather than less.
--
David Dyer-Bennet, mailto:[EMAIL PROTECTED], http://www.dd-b.net/dd-b/
RKBA: http://www.dd-b.net/carry/
Pics: http://www.dd-b.net/dd-b/SnapshotAlbum/
Dragaera/Steven Brust: http://dragaera.info/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Dick Davies

On 12/09/06, Celso [EMAIL PROTECTED] wrote:


I think it has already been said that in many peoples experience, when a disk 
fails, it completely fails. Especially on laptops. Of course ditto blocks 
wouldn't help you in this situation either!


Exactly.


I still think that silent data corruption is a valid concern, one that ditto 
blocks would solve.  Also, I am not thrilled about losing that much space for 
duplication of unneccessary data (caused by partitioning a disk in two).


Well, you'd only be duplicating the data on the mirror. If you don't want to
mirror the base OS, no one's saying you have to.
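
Concretely, that slice-based layout is something like this (disk and slice 
names invented):

  # Bulk data that isn't worth mirroring goes in a plain, single-slice pool:
  zpool create bulk c0t0d0s4

  # Two equal slices on the same disk make a mirrored pool for the files you
  # care about -- it survives bad sectors on one copy, though not the loss
  # of the whole disk:
  zpool create safe mirror c0t0d0s5 c0t0d0s6
  zfs create safe/documents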

For the sake of argument, let's assume:

1. disk is expensive
2. someone is keeping valuable files on a non-redundant zpool
3. they can't scrape enough vdevs to make a redundant zpool
   (remembering you can build vdevs out of *flat files*)

Even then, to my mind:

to the user, the *file* (screenplay, movie of a child's birth, civ3 saved
game, etc.)
is the logical entity to have a 'duplication level' attached to it,
and the only person who can score that is the author of the file.

This proposal says the filesystem creator/admin scores the filesystem.
Your argument against unnecessary data duplication applies to all 'non-special'
files in the 'special' filesystem. They're wasting space too.

If the user wants to make sure the file is 'safer' than others, he can
just make
multiple copies. Either to a USB disk/flashdrive, cdrw, dvd, ftp
server, whatever.

The redundancy you're talking about is what you'd get
from 'cp /foo/bar.jpg /foo/bar.jpg.ok', except it's hidden from the
user and causing
headaches for anyone trying to comprehend, port or extend the codebase in
the future.


I also echo Darren's comments on zfs performing better when it has the whole 
disk.


Me too, but a lot of laptop users dual-boot, which makes it a moot point.


Hopefully we can agree that you lose nothing by adding this feature,
even if you personally don't see a need for it.


Sorry, I don't think we're going to agree on this one :)

I've seen dozens of project proposals in the few months I've been lurking
around opensolaris. Most of them have been of no use to me, but
each to their own.

I'm afraid I honestly think this greatly complicates the conceptual model
(not to mention the technical implementation) of ZFS, and I haven't seen
a convincing use case.

All the best
Dick.

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread Neil A. Wilson

Matthew Ahrens wrote:

Matthew Ahrens wrote:
Here is a proposal for a new 'copies' property which would allow 
different levels of replication for different filesystems.


Thanks everyone for your input.

The problem that this feature attempts to address is when you have some 
data that is more important (and thus needs a higher level of 
redundancy) than other data.  Of course in some situations you can use 
multiple pools, but that is antithetical to ZFS's pooled storage model. 
 (You have to divide up your storage, you'll end up with stranded 
storage and bandwidth, etc.)


Given the overwhelming criticism of this feature, I'm going to shelve it 
for now.


This is unfortunate.  As a laptop user with only a single drive, I was 
looking forward to it since I've been bitten in the past by data loss 
caused by a bad area on the disk.  I don't care about the space 
consumption because I generally don't come anywhere close to filling up 
the available space.  It may not be the primary market for ZFS, but it 
could be a very useful side benefit.




Out of curiosity, what would you guys think about addressing this same 
problem by having the option to store some filesystems unreplicated on 
an mirrored (or raid-z) pool?  This would have the same issues of 
unexpected space usage, but since it would be *less* than expected, that 
might be more acceptable.  There are no plans to implement anything like 
this right now, but I just wanted to get a read on it.


I don't see much need for this in any area that I would use ZFS (either 
my own personal use or for any case in which I would recommend it for 
production use).


However, if you think that it's OK to under-report free space, then why 
not just do that for the data ditto blocks?  If one or more of my 
filesystems are configured to keep two copies of the data, then simply 
report only half of the available space.  If duplication isn't enabled 
for the entire pool but only for certain filesystems, then perhaps you 
could even take advantage of quotas for those filesystems to make a more 
accurate calculation.




--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Matthew Ahrens

Dick Davies wrote:

For the sake of argument, let's assume:

1. disk is expensive
2. someone is keeping valuable files on a non-redundant zpool
3. they can't scrape enough vdevs to make a redundant zpool
   (remembering you can build vdevs out of *flat files*)


Given those assumptions, I think that the proposed feature is the 
perfect solution.  Simply put those files in a filesystem that has copies > 1.


Also note that using files to back vdevs is not a recommended solution.


If the user wants to make sure the file is 'safer' than others, he
can just make multiple copies. Either to a USB disk/flashdrive, cdrw,
dvd, ftp server, whatever.


It seems to me that asking the user to solve this problem by manually 
making copies of all his files puts all the burden on the 
user/administrator and is a poor solution.


For one, they have to remember to do it pretty often.  For two, when 
they do experience some data loss, they have to manually reconstruct the 
files!  They could have one file which has part of it missing from copy 
A and part of it missing from copy B.  I'd hate to have to reconstruct 
that manually from two different files, but the proposed solution would 
do this transparently.



The redundancy you're talking about is what you'd get from 'cp
/foo/bar.jpg /foo/bar.jpg.ok', except it's hidden from the user and
causing headaches for anyone trying to comprehend, port or extend the
codebase in the future.


Whether it's hard to understand is debatable, but this feature 
integrates very smoothly with the existing infrastructure and wouldn't 
cause any trouble when extending or porting ZFS.



I'm afraid I honestly think this greatly complicates the conceptual model
(not to mention the technical implementation) of ZFS, and I haven't seen
a convincing use case.


Just for the record, these changes are pretty trivial to implement; less 
than 50 lines of code changed.


--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Celso
 On 12/09/06, Celso [EMAIL PROTECTED] wrote:
 
  I think it has already been said that in many
 peoples experience, when a disk fails, it completely
 fails. Especially on laptops. Of course ditto blocks
 wouldn't help you in this situation either!
 
 Exactly.
 
  I still think that silent data corruption is a
 valid concern, one that ditto blocks would solve. 
 Also, I am not thrilled about losing that much space
 for duplication of unneccessary data (caused by
 partitioning a disk in two).
 
 Well, you'd only be duplicating the data on the
 mirror. If you don't want to
 mirror the base OS, no one's saying you have to.
 

Yikes! That sounds like even more partitioning!

 For the sake of argument, let's assume:
 
 1. disk is expensive
 2. someone is keeping valuable files on a
 non-redundant zpool
 3. they can't scrape enough vdevs to make a redundant
 zpool
 (remembering you can build vdevs out of *flat
  files*)
 Even then, to my mind:
 
 to the user, the *file* (screenplay, movie of childs
 birth, civ3 saved
 game, etc.)
 is the logical entity to have a 'duplication level'
 attached to it,
 and the only person who can score that is the author
 of the file.
 
 This proposal says the filesystem creator/admin
 scores the filesystem.
 Your argument against unneccessary data duplication
 applies to all 'non-special'
 files in the 'special' filesystem. They're wasting
 space too.
 
 If the user wants to make sure the file is 'safer'
 than others, he can
 just make
 multiple copies. Either to a USB disk/flashdrive,
 cdrw, dvd, ftp
 server, whatever.
 
 The redundancy you're talking about is what you'd get
 from 'cp /foo/bar.jpg /foo/bar.jpg.ok', except it's
 hidden from the
 user and causing
 headaches for anyone trying to comprehend, port or
 extend the codebase in
 the future.

the proposed solution differs in one important aspect: it automatically detects 
data corruption.


  I also echo Darren's comments on zfs performing
 better when it has the whole disk.
 
 Me too, but a lot of laptop users dual-boot, which
 makes it a moot point.
 
  Hopefully we can agree that you lose nothing by
 adding this feature,
  even if you personally don't see a need for it.
 
 Sorry, I don't think we're going to agree on this one
 :)


No worries, that's cool. 
 All the best
 Dick.
 
 -- 
 Rasputin :: Jack of All Trades - Master of Nuns
 http://number9.hellooperator.net/
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss
 

Celso
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Chad Lewis


On Sep 12, 2006, at 4:39 PM, Celso wrote:


On 12/09/06, Celso [EMAIL PROTECTED] wrote:


I think it has already been said that in many

peoples experience, when a disk fails, it completely
fails. Especially on laptops. Of course ditto blocks
wouldn't help you in this situation either!

Exactly.


I still think that silent data corruption is a

valid concern, one that ditto blocks would solve. 
Also, I am not thrilled about losing that much space
for duplication of unneccessary data (caused by
partitioning a disk in two).

Well, you'd only be duplicating the data on the
mirror. If you don't want to
mirror the base OS, no one's saying you have to.



Yikes! that sounds like even more partitioning!



The redundancy you're talking about is what you'd get
from 'cp /foo/bar.jpg /foo/bar.jpg.ok', except it's
hidden from the
user and causing
headaches for anyone trying to comprehend, port or
extend the codebase in
the future.


the proposed solution differs in one important aspect: it  
automatically detects data corruption.





Detecting data corruption is a function of the ZFS checksumming  
feature. The proposed solution
has _nothing_ to do with detecting corruption. The difference is in  
what happens when/if such
bad data is detected. Without a duplicate copy, via some RAID level  
or the proposed ditto block

copies, the file is corrupted.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread eric kustarz

Matthew Ahrens wrote:


Matthew Ahrens wrote:

Here is a proposal for a new 'copies' property which would allow 
different levels of replication for different filesystems.



Thanks everyone for your input.

The problem that this feature attempts to address is when you have 
some data that is more important (and thus needs a higher level of 
redundancy) than other data.  Of course in some situations you can use 
multiple pools, but that is antithetical to ZFS's pooled storage 
model.  (You have to divide up your storage, you'll end up with 
stranded storage and bandwidth, etc.)


Given the overwhelming criticism of this feature, I'm going to shelve 
it for now.



So it seems to me that having this feature per-file is really useful.  
Say i have a presentation to give in Pleasanton, and the presentation 
lives on my single-disk laptop - I want all the meta-data and the actual 
presentation to be replicated.  We already use ditto blocks for the 
meta-data.  Now we could have an extra copy of the actual data.  When i 
get back from the presentation i can turn off the extra copies.
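
Under the proposal that on/off toggle would presumably be a one-liner each way 
(dataset name invented):

  zfs set copies=2 tank/home/slides    # before the trip
  zfs set copies=1 tank/home/slides    # back home again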


Doing it for the filesystem is just one step higher (and makes it 
administratively easier as i don't have to type the same command for 
each file thats important).


Mirroring is just like another step above that - though its possibly 
replicating stuff you just don't care about.


Now placing extra copies of the data doesn't guarantee that data will 
survive multiple disk failures; but neither does having a mirrored pool 
guarantee the data will be there either (2 disk failures).  Both methods 
are about increasing your chances of having your valuable data around.


I for one would have loved to have multiple copy filesystems + ZFS on my 
powerbook when i was travelling in Australia for a month - think of all 
the digital pictures you take and how pissed you would be if the one 
with the wild wombat didn't survive.


It's maybe not an enterprise solution, but it seems like a consumer solution.

Ensuring that the space accounting tools make sense is definitely a 
valid point though.


eric



Out of curiosity, what would you guys think about addressing this same 
problem by having the option to store some filesystems unreplicated on 
an mirrored (or raid-z) pool?  This would have the same issues of 
unexpected space usage, but since it would be *less* than expected, 
that might be more acceptable.  There are no plans to implement 
anything like this right now, but I just wanted to get a read on it.


--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Jeff Victor

Chad Lewis wrote:


On Sep 12, 2006, at 4:39 PM, Celso wrote:


the proposed solution differs in one important aspect: it automatically
detects data corruption.


Detecting data corruption is a function of the ZFS checksumming feature. The
proposed solution has _nothing_ to do with detecting corruption. The difference
is in what happens when/if such bad data is detected. Without a duplicate copy,
via some RAID level  or the proposed ditto block copies, the file is corrupted.



With a mirrored ZFS pool, what are the odds of losing all copies of the 
[meta]data, for N disks (where N = 1, 2, etc)?   I thought we understood this 
pretty well, and that the answer was extremely small.
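
As a rough, textbook back-of-envelope (not a ZFS-specific figure, and assuming 
independent failures): a single mirrored pair only loses data if the second 
disk dies while the first is being replaced and resilvered, which gives roughly

  MTTDL  ~  MTBF^2 / (2 * MTTR)

With vendor-quoted MTBFs in the hundreds of thousands of hours and repair 
times measured in hours, that comes out astronomically large -- consistent 
with "extremely small" odds of loss.  Silent corruption and correlated 
failures are what eat into that number in practice.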


--
Jeff VICTOR  Sun Microsystemsjeff.victor @ sun.com
OS AmbassadorSr. Technical Specialist
Solaris 10 Zones FAQ:http://www.opensolaris.org/os/community/zones/faq
--
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Bizzare problem with ZFS filesystem

2006-09-12 Thread Anantha N. Srirama
Here's the information you requested.

Script started on Tue Sep 12 16:46:46 2006
# uname -a
SunOS umt1a-bio-srv2 5.10 Generic_118833-18 sun4u sparc SUNW,Netra-T12
# prtdiag
System Configuration: Sun Microsystems  sun4u Sun Fire E2900
System clock frequency: 150 MHZ
Memory size: 96GB   

=== CPUs 
===
   E$  CPU  CPU
CPU  Freq  SizeImplementation   MaskStatus  Location
---    --  ---  -   --  
  0,512  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB0/P0
  1,513  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB0/P1
  2,514  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB0/P2
  3,515  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB0/P3
  8,520  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB2/P0
  9,521  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB2/P1
 10,522  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB2/P2
 11,523  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB2/P3
 16,528  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB4/P0
 17,529  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB4/P1
 18,530  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB4/P2
 19,531  1500 MHz  32MBSUNW,UltraSPARC-IV+   2.1on-line SB4/P3

# mdb -k
Loading modules: [ unix krtld genunix dtrace specfs ufs sd sgsbbc md 
sgenv ip sctp usba fcp fctl qlc nca ssd lofs zfs random crypto ptm nfs ipc 
logindmux cpc sppp fcip wrsmd ]
 arc::stat    print
{
anon = ARC_anon
mru = ARC_mru
mru_ghost = ARC_mru_ghost
mfu = ARC_mfu
mfu_ghost = ARC_mfu_ghost
size = 0x11917e1200
p = 0x116e8a1a40
c = 0x11917cf428
c_min = 0xbf77c800
c_max = 0x17aef9
hits = 0x489737a8
misses = 0x8869917
deleted = 0xc832650
skipped = 0x15b29b2
hash_elements = 0x1273d0
hash_elements_max = 0x17576f
hash_collisions = 0x4e0ceee
hash_chains = 0x3a9b2
Segmentation Fault - core dumped
# mdb -k
Loading modules: [ unix krtld genunix dtrace specfs ufs sd sgsbbc md 
sgenv ip sctp usba fcp fctl qlc nca ssd lofs zfs random crypto ptm nfs ipc 
logindmux cpc sppp fcip wrsmd ]
 ::kmastat
 ::pgrep vi | ::walk thread
3086600f660
 : 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598d91
[ 02a104598d91 cv_wait_sig+0x114() ]
  02a104598e41 str_cv_wait+0x28()
  02a104598f01 strwaitq+0x238()
  02a104598fc1 strread+0x174()
  02a1045990a1 fop_read+0x20()
  02a104599161 read+0x274()
  02a1045992e1 syscall_trap32+0xcc()
 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598e61
 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598e61
  02a104598f61 zil_lwb_commit+0x1ac()
  02a104599011 zil_commit+0x1b0()
  02a1045990c1 zfs_fsync+0xa8()
  02a104599171 fop_fsync+0x14()
  02a104599231 fdsync+0x20()
  02a1045992e1 syscall_trap32+0xcc()
 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598c71
 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598e61
  02a104598f61 zil_lwb_commit+0x1ac()
  02a104599011 zil_commit+0x1b0()
  02a1045990c1 zfs_fsync+0xa8()
  02a104599171 fop_fsync+0x14()
  02a104599231 fdsync+0x20()
  02a1045992e1 syscall_trap32+0xcc()
 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598e61
  02a104598f61 zil_lwb_commit+0x1ac()
  02a104599011 zil_commit+0x1b0()
  02a1045990c1 zfs_fsync+0xa8()
  02a104599171 fop_fsync+0x14()
  02a104599231 fdsync+0x20()
  02a1045992e1 syscall_trap32+0xcc()
 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598e61
 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598bb1
 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598e61
  02a104598f61 zil_lwb_commit+0x1ac()
  02a104599011 zil_commit+0x1b0()
  02a1045990c1 zfs_fsync+0xa8()
  02a104599171 fop_fsync+0x14()
  02a104599231 fdsync+0x20()
  02a1045992e1 syscall_trap32+0xcc()
 3086600f660::findstack
stack pointer for thread 3086600f660 (TS_FREE): 2a104598ba1
  02a104598fe1 segvn_unmap+0x1b8()
  02a1045990d1 as_free+0xf4()
  02a104599181 proc_exit+0x46c()
  02a104599231 exit+8()
  02a1045992e1 syscall_trap32+0xcc()
# df -h
Filesystem size   used  avail capacity  Mounted on
/dev/md/dsk/d10 32G   6.7G25G22%/
/devices 0K 0K 0K 0%/devices
ctfs 0K 0K 0K 0%/system/contract
proc 0K 0K 0K 0%/proc
mnttab   0K 0K 0K 0%/etc/mnttab
swap  

Re: [zfs-discuss] Proposal: multiple copies of user data

2006-09-12 Thread David Dyer-Bennet

On 9/12/06, eric kustarz [EMAIL PROTECTED] wrote:


So it seems to me that having this feature per-file is really useful.
Say i have a presentation to give in Pleasanton, and the presentation
lives on my single-disk laptop - I want all the meta-data and the actual
presentation to be replicated.  We already use ditto blocks for the
meta-data.  Now we could have an extra copy of the actual data.  When i
get back from the presentation i can turn off the extra copies.


Yes, you could do that.

*I* would make a copy on a CD, which I would carry in a separate case
from the laptop.

I think my presentation is a lot safer than your presentation.

Similarly for your digital images example; I don't consider it safe
until I have two or more *independent* copies.  Two copies on a single
hard drive doesn't come even close to passing the test for me; as many
people have pointed out, those tend to fail all at once.  And I will
also point out that laptops get stolen a lot.  And of course all the
accidents involving fumble-fingers, OS bugs, and driver bugs won't be
helped by the data duplication either.  (Those will mostly be helped
by sensible use of snapshots, though, which is another argument for
ZFS on *any* disk you work on a lot.)

The more I look at it the more I think that a second copy on the same
disk doesn't protect against very much real-world risk.  Am I wrong
here?  Are partial (small) disk corruptions more common than I think?
I don't have a good statistical view of disk failures.
--
David Dyer-Bennet, mailto:[EMAIL PROTECTED], http://www.dd-b.net/dd-b/
RKBA: http://www.dd-b.net/carry/
Pics: http://www.dd-b.net/dd-b/SnapshotAlbum/
Dragaera/Steven Brust: http://dragaera.info/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread David Dyer-Bennet

On 9/12/06, Celso [EMAIL PROTECTED] wrote:


 Whether it's hard to understand is debatable, but
 this feature
 integrates very smoothly with the existing
 infrastructure and wouldn't
 cause any trouble when extending or porting ZFS.


OK, given this statement...


 Just for the record, these changes are pretty trivial
 to implement; less
 than 50 lines of code changed.

and this statement, I can't see any reason not to include it. If the changes 
are easy to do, don't require any more of the zfs team's valuable time, and 
don't hinder other things, I would plead with you to include them, as I think 
they are genuinely valuable and would make zfs not only the best 
enterprise-level filesystem, but also the best filesystem for laptops/home computers.


While I'm not a big fan of this feature, if the work is that well
understood and that small, I have no objection to it.  (Boy that
sounds snotty; apologies, not what I intend here.  Those of you
reading this know how much you care about my opinion, that's up to
you.)

I do pity the people who count on the ZFS redundancy to protect their
presentation on an important sales trip -- and then have their laptop
stolen.  But those people might well be the same ones who would have
*no* redundancy otherwise.  And nothing about this feature prevents
the paranoids like me from still making our backup CD and carrying it
separately.

I'm not prepared to go so far as to argue that it's bad to make them
feel safer :-).  At least, to make them feel safer *by making them
actually safer*.
--
David Dyer-Bennet, mailto:[EMAIL PROTECTED], http://www.dd-b.net/dd-b/
RKBA: http://www.dd-b.net/carry/
Pics: http://www.dd-b.net/dd-b/SnapshotAlbum/
Dragaera/Steven Brust: http://dragaera.info/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Re: Re: Proposal: multiple copies of user data

2006-09-12 Thread Torrey McMahon

David Dyer-Bennet wrote:


While I'm not a big fan of this feature, if the work is that well
understood and that small, I have no objection to it.  (Boy that
sounds snotty; apologies, not what I intend here.  Those of you
reading this know how muich you care about my opinion, that's up to
you.)


One could make the argument that the feature could cause enough 
confusion to not warrant its inclusion. If I'm a typical user and I 
write a file to the filesystem where the admin set three copies but 
didn't tell me, it might throw me into a tizzy trying to figure out why 
my quota is 3X what I expect it to be.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss