This proposal would benefit greatly from a "problem statement."  As it stands, it feels like a solution looking for a problem.

The Introduction describes a different problem and its solution, but then takes for granted that this new solution has value.  The Description section mentions some benefits of 'copies' relative to the existing situation, but requires that the reader piece together the whole picture.  And IMO there aren't enough pieces :-), i.e. so far I haven't seen sufficient justification for the added administrative complexity and the potential for confusion, for both administrators and users.

Matthew Ahrens wrote:
Here is a proposal for a new 'copies' property which would allow different levels of replication for different filesystems.

Your comments are appreciated!

--matt

A. INTRODUCTION

ZFS stores multiple copies of all metadata.  This is accomplished by
storing up to three DVAs (Data Virtual Addresses) in each block pointer.
This feature is known as "Ditto Blocks".  When possible, the copies are
stored on different disks.

See bug 6410698 "ZFS metadata needs to be more highly replicated (ditto
blocks)" for details on ditto blocks.

This case will extend this feature to allow system administrators to
store multiple copies of user data as well, on a per-filesystem basis.
These copies are in addition to any redundancy provided at the pool
level (mirroring, raid-z, etc).

B. DESCRIPTION

A new property will be added, 'copies', which specifies how many copies
of the given filesystem will be stored.  Its value must be 1, 2, or 3.
Like other properties (e.g. checksum, compression), it only affects
newly-written data.  As such, it is recommended that the 'copies'
property be set at filesystem creation time
(e.g. 'zfs create -o copies=2 pool/fs').
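
For concreteness, here is a hypothetical session (the pool and
filesystem names are made up for illustration):

    # set at creation time, so that all of the data gets extra copies
    zfs create -o copies=2 tank/home

    # changing the value later only affects data written from then on
    zfs set copies=3 tank/home
    zfs get copies tank/home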

The pool must be at least on-disk version 2 to use this feature (see
'zpool upgrade').
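
A sketch of how an administrator might check and, if necessary,
upgrade the pool version ('tank' is a placeholder pool name):

    # show the on-disk version of each pool on the system
    zpool upgrade

    # upgrade a pool that is below version 2
    zpool upgrade tank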

By default (copies=1), only two copies of most filesystem metadata are
stored.  However, if we are storing multiple copies of user data, then 3
copies (the maximum) of filesystem metadata will be stored.
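
In other words, the rule appears to be (my reading; the proposal does
not state it as a formula):

    metadata copies = MIN(copies + 1, 3)

so copies=1 gives 2 copies of most metadata, while copies=2 or
copies=3 gives the maximum of 3.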

This feature is similar to using mirroring, but differs in several
important ways:

* Different filesystems in the same pool can have different numbers of
   copies.
* The storage configuration is not constrained as it is with mirroring
   (e.g. you can have multiple copies even on a single disk; see the
   sketch after this list).
* Mirroring offers slightly better performance, because only one DVA
   needs to be allocated.
* Mirroring offers slightly better redundancy, because one disk from
   each mirror can fail without data loss.
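
To make the configuration difference concrete, compare these two
hypothetical setups (disk names are placeholders):

    # mirroring: redundancy is fixed for the whole pool and
    # requires at least two disks
    zpool create tank mirror c0t0d0 c0t1d0

    # copies: redundancy is chosen per filesystem, and works
    # even on a single-disk pool
    zpool create tank c0t0d0
    zfs create -o copies=2 tank/important
    zfs create -o copies=1 tank/scratch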

It is important to note that the copies provided by this feature are in
addition to any redundancy provided by the pool configuration or the
underlying storage.  For example:

* In a pool with 2-way mirrors, a filesystem with copies=1 (the default)
   will be stored with 2 * 1 = 2 copies.  The filesystem can tolerate any
   1 disk failing without data loss.
* In a pool with 2-way mirrors, a filesystem with copies=3
   will be stored with 2 * 3 = 6 copies.  The filesystem can tolerate any
   5 disks failing without data loss (assuming that there are at least
   ncopies=3 mirror groups).
* In a pool with single-parity raid-z, a filesystem with copies=2
   will be stored with 2 copies, each copy protected by its own parity
   block.  The filesystem can tolerate any 3 disks failing without data
   loss (assuming that there are at least ncopies=2 raid-z groups; see
   the sketch below).
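
As a sketch of the raid-z example (disk names are placeholders), a
pool with two single-parity raid-z groups could be created like this:

    zpool create tank raidz c0d0 c1d0 c2d0 raidz c3d0 c4d0 c5d0
    zfs create -o copies=2 tank/fs

With the two copies placed in different raid-z groups, each copy can
survive one disk failure in its group, so any three failures leave at
least one copy intact.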


C. MANPAGE CHANGES
*** zfs.man4    Tue Jun 13 10:15:38 2006
--- zfs.man5    Mon Sep 11 16:34:37 2006
***************
*** 708,714 ****
--- 708,725 ----
            they are inherited.


+      copies=1 | 2 | 3
+
+        Controls the number of copies of data stored for this dataset.
+        These copies are in addition to any redundancy provided by the
+        pool (e.g. mirroring or raid-z).  The copies will be stored on
+        different disks if possible.
+
+        Changing this property only affects newly-written data.
+        Therefore, it is recommended that this property be set at
+        filesystem creation time, using the '-o copies=' option.
+
+
     Temporary Mountpoint Properties
        When a file system is mounted, either through mount(1M)  for
        legacy  mounts  or  the  "zfs mount" command for normal file

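
To verify what a given dataset will use (assuming the manpage text
above; 'tank/fs' is a placeholder), the existing property commands
apply:

    # show the value and whether it is set locally or inherited
    zfs get -o name,property,value,source copies tank/fs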

D. REFERENCES

--
--------------------------------------------------------------------------
Jeff VICTOR              Sun Microsystems            jeff.victor @ sun.com
OS Ambassador            Sr. Technical Specialist
Solaris 10 Zones FAQ:    http://www.opensolaris.org/os/community/zones/faq
--------------------------------------------------------------------------