Eddie Edwards wrote:
> In principle, I think you're right, in that an objset could hold both
> versions of a file.  It would contain physical pointers to each copy.
> They would start out the same and diverge when one or other copy was
> modified.
> 
> For instance, zfs could support a fast "cp" which just cloned the file
> i.e. copied the objset entry to a new objset entry.  This would allow you
> to copy a 100GB file in milliseconds, and still have the ability to update
> both copies of the file independently.  This is similar in principle to
> UNIX fork().
> 
> But I'm not aware of any user-level interfaces that would allow this to
> happen.

That's because multiple pointers to the same block in the same dataset would 
make it Extremely Nontrivial to determine when to free a block.  For a 
related problem, see RFE 6343653[*] "want to quickly "copy" a file from a 
snapshot".

fork() has an additional level of indirection (the page_t) and it takes 
O(address space) time because it has to bump the reference count on each page.
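
For comparison, a toy model of that fork() cost (not the Solaris VM code;
page_t and as_t here are simplified stand-ins): one refcount bump per
mapped page, hence O(address space):

#include <stdio.h>
#include <stdlib.h>

typedef struct page {
        int p_refcnt;           /* address spaces mapping this page */
} page_t;

typedef struct as {
        page_t  **a_pages;      /* one slot per mapped page */
        size_t  a_npages;
} as_t;

/* duplicate an address space for fork(): O(a_npages) refcount bumps */
static as_t *
as_dup(const as_t *parent)
{
        as_t *child = malloc(sizeof (*child));

        child->a_npages = parent->a_npages;
        child->a_pages = malloc(child->a_npages * sizeof (page_t *));
        for (size_t i = 0; i < parent->a_npages; i++) {
                parent->a_pages[i]->p_refcnt++; /* the extra indirection */
                child->a_pages[i] = parent->a_pages[i];
        }
        return (child);
}

int
main(void)
{
        page_t pages[4] = { { 1 }, { 1 }, { 1 }, { 1 } };
        page_t *ptrs[4] = { &pages[0], &pages[1], &pages[2], &pages[3] };
        as_t parent = { ptrs, 4 };
        as_t *child = as_dup(&parent);

        printf("page 0 refcnt after \"fork\": %d\n", pages[0].p_refcnt);
        free(child->a_pages);
        free(child);
        return (0);
}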

--matt

[*] oh, you can't see the evaluation.  I'll reproduce it here:

In addition to simply copying the bp's, we will need to remove any
copied bp's from their deadlists.  We don't know exactly which deadlist
they are in, so we'll have to search all deadlists after the snapshot we
are copying from.

One way to architect this would be for the zpl to set up the destination
object, then call a "dmu_splice" routine which would copy a range of one
object to another.  Note that if the old object still exists, you *must*
copy to that one, otherwise the head objset could have the same bp
stored in many objects (very bad).  The dmu could store an in-core list
of bp's to remove from deadlist ("reincarnate"?), and process it in
syncing context (by creating an avl tree and traversing the deadlists
once).

*** (#1 of 2): 2005-10-30 04:59:00 GMT+00:00    matthew.ahrens at sun.com
*** Last Edit: 2005-10-30 04:59:00 GMT+00:00    matthew.ahrens at sun.com
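
To illustrate the splice-and-reincarnate idea from that evaluation, a
minimal sketch with toy data structures (the dmu_splice and "reincarnate"
names come from the comment above; the flat array standing in for the avl
tree and everything else is invented for illustration):

#include <stdint.h>

#define REINCARNATE_MAX 1024

typedef struct blkptr {
        uint64_t blk_id;        /* stand-in for a real block pointer/DVA */
} blkptr_t;

/*
 * In-core set of bps copied out of the snapshot this txg.  The real
 * design would key an avl tree on the bp so that syncing context can
 * make one pass over the deadlists with cheap lookups.
 */
static blkptr_t reincarnate[REINCARNATE_MAX];
static int nreincarnate;

static void
reincarnate_record(const blkptr_t *bp)
{
        reincarnate[nreincarnate++] = *bp;
}

/*
 * "dmu_splice": copy a range of bps from the snapshot object into the
 * (already created) head object, remembering every copied bp so that
 * syncing context can later pull it off whatever deadlist it is on.
 */
static void
dmu_splice(const blkptr_t *snap_bps, blkptr_t *head_bps, int nbps)
{
        for (int i = 0; i < nbps; i++) {
                head_bps[i] = snap_bps[i];
                reincarnate_record(&snap_bps[i]);
        }
}

int
main(void)
{
        blkptr_t snap[2] = { { 10 }, { 11 } };
        blkptr_t head[2];

        dmu_splice(snap, head, 2);      /* head now shares snap's bps */
        return (nreincarnate == 2 ? 0 : 1);
}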

Some design notes...

sh: zfs recover <from-.snapshot-path> <to-fs-path>
sh: zfs recover @<snapname> <to-fs-path>
zpl: remove old version of file[*].
      if not recovering into same <obj,gen>:
         create new file
         set file metadata (perms, size, etc) to snap file's
         add snap's <obj,gen> -> new <obj,gen> to recovered-map
dmu: copy bps from snapshot to head obj (use db_overridden_by?)
dmu: put bps in reincarnate avl-tree (per txg)
dsl: when syncing, walk each deadlist, most recent first.
      if a bp is found in the reincarnate avl, remove the bp from the
      deadlist (decrease unique if necessary) (move last entry here),
      and remove the bp from the avl.  when the avl is empty, stop.
      assert that the avl becomes empty.
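
A sketch of that syncing-context pass, again with toy structures (flat
arrays stand in for the avl tree and the per-snapshot deadlists, and the
"unique" accounting is only hinted at in a comment):

#include <assert.h>
#include <stdint.h>

typedef struct blkptr {
        uint64_t blk_id;        /* stand-in for a real block pointer */
} blkptr_t;

typedef struct deadlist {
        blkptr_t dl_bps[8];
        int      dl_nbps;
        uint64_t dl_unique;     /* bytes unique to this snapshot */
} deadlist_t;

/* bps spliced into the head object this txg (filled in open context) */
static blkptr_t reincarnate[8];
static int nreincarnate;

/*
 * Syncing context: walk the deadlists, most recent first.  Any bp found
 * in the reincarnate set is removed from the deadlist (the last entry is
 * moved into its slot) and from the set; once the set is empty we stop.
 */
static void
reincarnate_sync(deadlist_t *dls, int ndls)
{
        for (int d = ndls - 1; d >= 0 && nreincarnate > 0; d--) {
                for (int i = 0; i < dls[d].dl_nbps; ) {
                        int j, found = 0;

                        for (j = 0; j < nreincarnate; j++) {
                                if (dls[d].dl_bps[i].blk_id ==
                                    reincarnate[j].blk_id) {
                                        found = 1;
                                        break;
                                }
                        }
                        if (found) {
                                /* move the last entry here, shrink both */
                                dls[d].dl_bps[i] =
                                    dls[d].dl_bps[--dls[d].dl_nbps];
                                reincarnate[j] = reincarnate[--nreincarnate];
                                /*
                                 * Decrease dl_unique here if this bp was
                                 * counted as unique to the snapshot.
                                 */
                        } else {
                                i++;
                        }
                }
        }
        assert(nreincarnate == 0);      /* every spliced bp was found */
}

int
main(void)
{
        deadlist_t dls[2] = {
                { { { 1 }, { 2 } }, 2, 0 },     /* older snapshot */
                { { { 3 } }, 1, 0 },            /* most recent snapshot */
        };

        reincarnate[0].blk_id = 2;
        nreincarnate = 1;
        reincarnate_sync(dls, 2);
        return (dls[0].dl_nbps == 1 ? 0 : 1);
}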

[*] to remove the old version of the file:
         set tocheck <obj,gen> = snap's <obj,gen>
         if tocheck <obj,gen> is in recovered-map
                 set tocheck <obj,gen> = value <obj,gen>
         if tocheck <obj,gen> exists in head fs {
                 if it is the specified destination file, truncate it
                 otherwise, fail "you must remove filename"
         }
         remove tocheck <obj,gen> from recovered-map
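
The same removal check as a sketch (obj/gen pairs reduced to a small
struct, the recovered-map to a fixed array, and the head-fs existence
test to a stub):

#include <stdio.h>
#include <stdint.h>

typedef struct objgen {
        uint64_t obj;
        uint64_t gen;
} objgen_t;

/* recovered-map: snapshot <obj,gen> -> new head <obj,gen> */
typedef struct recovered_entry {
        objgen_t key;
        objgen_t val;
        int      valid;
} recovered_entry_t;

static recovered_entry_t recovered_map[16];
static objgen_t destination;            /* file named on the command line */

static int
objgen_eq(objgen_t a, objgen_t b)
{
        return (a.obj == b.obj && a.gen == b.gen);
}

static recovered_entry_t *
recovered_lookup(objgen_t key)
{
        for (int i = 0; i < 16; i++)
                if (recovered_map[i].valid &&
                    objgen_eq(recovered_map[i].key, key))
                        return (&recovered_map[i]);
        return (NULL);
}

/* stand-in for "does this <obj,gen> exist in the head fs?" */
static int
head_fs_exists(objgen_t og)
{
        return (og.obj != 0);
}

/*
 * Remove the old version of the file before the splice: follow the
 * recovered-map if this object was already recovered once, truncate
 * the destination if that is what we find, otherwise make the user
 * remove the conflicting file first.  Returns 0 on success, -1 for
 * "you must remove filename".
 */
static int
remove_old_version(objgen_t snap)
{
        objgen_t tocheck = snap;
        recovered_entry_t *e = recovered_lookup(tocheck);

        if (e != NULL)
                tocheck = e->val;

        if (head_fs_exists(tocheck)) {
                if (!objgen_eq(tocheck, destination))
                        return (-1);
                printf("truncate destination <%llu,%llu>\n",
                    (unsigned long long)tocheck.obj,
                    (unsigned long long)tocheck.gen);
        }
        if (e != NULL)
                e->valid = 0;           /* drop the recovered-map entry */
        return (0);
}

int
main(void)
{
        destination = (objgen_t){ 42, 1 };
        return (remove_old_version((objgen_t){ 42, 1 }));
}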

*** (#2 of 2): 2006-05-25 19:03:05 GMT+00:00    matthew.ahrens at sun.com
*** Last Edit: 2006-05-25 19:03:05 GMT+00:00    matthew.ahrens at sun.com
