[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-10 Thread Peter Eriksson
At lunch today we started talking about a feature that would be nice 
to have: a system call somewhat similar to link(2), where you would get a cloned 
copy of a file that would (initially) share the same data blocks on the disk 
but would use copy-on-write to create private copies as soon as something is 
modified.

It could be useful, for example, for a mail server using maildirs: the 
mail delivery program sending a message to multiple users could use that 
syscall instead of writing N copies of the same mail...
(It would save space until the users start to modify the files.)
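
A minimal sketch of the idea, as a toy in-memory model rather than a real
syscall. The names BlockStore, File, clone and write_block are hypothetical
and are not part of ZFS or of any existing API; the per-block counter is only
one possible way to track sharing.

```python
class BlockStore:
    """Toy block store: block id -> bytes, plus a reference count per block."""

    def __init__(self):
        self.blocks = {}
        self.refs = {}
        self.next_id = 0

    def alloc(self, data):
        # Allocate a new block holding `data`, referenced once.
        bid = self.next_id
        self.next_id += 1
        self.blocks[bid] = data
        self.refs[bid] = 1
        return bid

    def hold(self, bid):
        # One more file now points at this block.
        self.refs[bid] += 1

    def release(self, bid):
        # Drop one reference; free the block when nobody points at it.
        self.refs[bid] -= 1
        if self.refs[bid] == 0:
            del self.refs[bid]
            del self.blocks[bid]


class File:
    """A file is just an ordered list of block ids in a shared store."""

    def __init__(self, store, chunks=()):
        self.store = store
        self.block_ids = [store.alloc(c) for c in chunks]

    def clone(self):
        # Share every block with the new file; no data is copied.
        other = File(self.store)
        for bid in self.block_ids:
            self.store.hold(bid)
        other.block_ids = list(self.block_ids)
        return other

    def write_block(self, index, data):
        # Copy-on-write: write into a fresh block, then drop the old reference.
        old = self.block_ids[index]
        self.block_ids[index] = self.store.alloc(data)
        self.store.release(old)

    def read(self):
        return b"".join(self.store.blocks[b] for b in self.block_ids)
```

In the maildir case, the delivery program would write the message once and
call clone() once per additional recipient; only a block that a recipient
later modifies gets a private copy.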

Anyway, just a brainstorm idea that came up... :-)
--
This message was posted from opensolaris.org



[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-11 Thread Robert Milkowski
Hello Peter,

Wednesday, October 10, 2007, 1:59:20 PM, you wrote:

PE> At the lunch today we started talking about a feature that would
PE> have been nice to have - a system call sort of similar to link(2)
PE> where you would get a cloned copy of a file that would (initially)
PE> share the same data blocks on the disk but would use copy-on-write
PE> to create private copies as soon as something is modified.

PE> It could be nice to use for example for a mail server using
PE> maildir:s so that the mail delivery program sending a mail to
PE> multiple users could use that syscall instead of writing N copies
PE> of the same mail...
PE> (It would save space until the users would start to modify the files).

While I totally second the idea (it was discussed on zfs-discuss some
time ago), in the case of an email platform it wouldn't be that easy, as
every file is different (different headers at least). But there are
definitely other uses where it would be useful and improve the user
experience. I haven't looked into the details, but in theory one should be
able to copy/move a file within the same datapool between datasets
without having to actually copy data blocks... or maybe there's some
detail which actually makes it hard to implement...

-- 
Best regards,
 Robert Milkowski  mailto:rmilkowski at task.gda.pl
   http://milek.blogspot.com




[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-11 Thread Matthew Ahrens
Robert Milkowski wrote:
> I haven't looked into details but in theory one should be
> able to copy/move a file within the same datapool between datasets
> without having to actually copy data blocks... or maybe there's some
> detail which actually makes it hard to implement...

Once a block is referenced by multiple filesystems, it is nontrivial to 
determine when it can be freed.

--matt



[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-11 Thread Robert Milkowski
Hello Matthew,

Thursday, October 11, 2007, 9:10:13 AM, you wrote:

MA> Robert Milkowski wrote:
>> I haven't looked into details but in theory one should be
>> able to copy/move a file within the same datapool between datasets
>> without having to actually copy data blocks... or maybe there's some
>> detail which actually makes it hard to implement...

MA> Once a block is referenced by multiple filesystems, it is nontrivial to
MA> determine when it can be freed.

In a way, multiple snapshots are separate file systems, or clones...
What's the difference? However, I'm sure you're right...

-- 
Best regards,
 Robert Milkowski  mailto:rmilkowski at task.gda.pl
   http://milek.blogspot.com




[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-11 Thread Pawel Jakub Dawidek
On Thu, Oct 11, 2007 at 10:47:44AM +0100, Robert Milkowski wrote:
> Hello Matthew,
> 
> Thursday, October 11, 2007, 9:10:13 AM, you wrote:
> 
> MA> Robert Milkowski wrote:
> >> I haven't looked into details but in theory one should be
> >> able to copy/move a file within the same datapool between datasets
> >> without having to actually copy data blocks... or maybe there's some
> >> detail which actually makes it hard to implement...
> 
> MA> Once a block is referenced by multiple filesystems, it is nontrivial to
> MA> determine when it can be freed.
> 
> In a way, multiple snapshots are separate file systems, or clones...
> What's the difference? However, I'm sure you're right...

Snapshots and clones are not autonomous datasets. A clone always has a
parent; you can use 'zfs promote' to switch the relationship, but you
cannot make them independent, AFAIK.

To Matthew: As I understand it, Robert was talking more about moving the
blocks to another dataset, not creating a hardlink-like situation - only
one dataset will reference the blocks after the move.

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
pjd at FreeBSD.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!



[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-11 Thread Robert Milkowski
Hello Pawel,

Thursday, October 11, 2007, 11:27:07 AM, you wrote:

PJD> On Thu, Oct 11, 2007 at 10:47:44AM +0100, Robert Milkowski wrote:
>> Hello Matthew,
>> 
>> Thursday, October 11, 2007, 9:10:13 AM, you wrote:
>> 
>> MA> Robert Milkowski wrote:
>> >> I haven't looked into details but in theory one should be
>> >> able to copy/move a file within the same datapool between datasets
>> >> without having to actually copy data blocks... or maybe there's some
>> >> detail which actually makes it hard to implement...
>> 
>> MA> Once a block is referenced by multiple filesystems, it is nontrivial to
>> MA> determine when it can be freed.
>> 
>> In a way, multiple snapshots are separate file systems, or clones...
>> What's the difference? However, I'm sure you're right...

PJD> Snapshots and clones are not autonomous datasets. A clone always has a
PJD> parent; you can use 'zfs promote' to switch the relationship, but you
PJD> cannot make them independent, AFAIK.

PJD> To Matthew: As I understand it, Robert was talking more about moving the
PJD> blocks to another dataset, not creating a hardlink-like situation - only
PJD> one dataset will reference the blocks after the move.

Yep, with move that's what I had in mind.
I was also talking about zfscopy...


-- 
Best regards,
 Robert Milkowski  mailto:rmilkowski at task.gda.pl
   http://milek.blogspot.com




[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-11 Thread Matthew Ahrens
Pawel Jakub Dawidek wrote:
> On Thu, Oct 11, 2007 at 10:47:44AM +0100, Robert Milkowski wrote:
>> Hello Matthew,
>>
>> Thursday, October 11, 2007, 9:10:13 AM, you wrote:
>>
>> MA> Robert Milkowski wrote:
>> >> I haven't looked into details but in theory one should be
>> >> able to copy/move a file within the same datapool between datasets
>> >> without having to actually copy data blocks... or maybe there's some
>> >> detail which actually makes it hard to implement...
>>
>> MA> Once a block is referenced by multiple filesystems, it is nontrivial to
>> MA> determine when it can be freed.
>>
>> In a way, multiple snapshots are separate file systems, or clones...
>> What's the difference? However, I'm sure you're right...

Well, snapshots are nontrivial too.  See

http://blogs.sun.com/ahrens/entry/is_it_magic

> Snapshots and clones are not autonomous datasets. A clone always has a
> parent; you can use 'zfs promote' to switch the relationship, but you
> cannot make them independent, AFAIK.
> 
> To Matthew: As I understand it, Robert was talking more about moving the
> blocks to another dataset, not creating a hardlink-like situation - only
> one dataset will reference the blocks after the move.

Well, he said "copy/move".  "copy" implied to me that both filesystems would 
reference the same blocks.  And even if it is just "move", you still have the 
issue of snapshots from the original filesystem referencing it.  Changing the 
snapshots so they no longer reference the file?  Also nontrivial.

--matt




[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-11 Thread Pawel Jakub Dawidek
On Thu, Oct 11, 2007 at 09:49:51AM -0700, Matthew Ahrens wrote:
> Pawel Jakub Dawidek wrote:
> > On Thu, Oct 11, 2007 at 10:47:44AM +0100, Robert Milkowski wrote:
> >> Hello Matthew,
> >>
> >> Thursday, October 11, 2007, 9:10:13 AM, you wrote:
> >>
> >> MA> Robert Milkowski wrote:
> >> >> I haven't looked into details but in theory one should be
> >> >> able to copy/move a file within the same datapool between datasets
> >> >> without having to actually copy data blocks... or maybe there's some
> >> >> detail which actually makes it hard to implement...
> >>
> >> MA> Once a block is referenced by multiple filesystems, it is nontrivial to
> >> MA> determine when it can be freed.
> >>
> >> In a way, multiple snapshots are separate file systems, or clones...
> >> What's the difference? However, I'm sure you're right...
> 
> Well, snapshots are nontrivial too.  See
> 
> http://blogs.sun.com/ahrens/entry/is_it_magic
> 
> > Snapshots and clones are not autonomous datasets. A clone always has a
> > parent; you can use 'zfs promote' to switch the relationship, but you
> > cannot make them independent, AFAIK.
> > 
> > To Matthew: As I understand it, Robert was talking more about moving the
> > blocks to another dataset, not creating a hardlink-like situation - only
> > one dataset will reference the blocks after the move.
> 
> Well, he said "copy/move".  "copy" implied to me that both filesystems would 
> reference the same blocks.  And even if it is just "move", you still have the 
> issue of snapshots from the original filesystem referencing it.  Changing the 
> snapshots so they no longer reference the file?  Also nontrivial.

I'm sorry for trying to be too helpful:)

I understand it's not trivial, but being able to reference the same
block from different datasets would be a really nice feature to have.
The functionality discussed above is only one example. Another example
would be block aggregation (which has its own name I can't recall right
now), so we can run a thread once a day that frees duplicated blocks and
makes datasets point at one copy only.
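
A rough sketch of such a nightly aggregation pass, as a toy model only. The
aggregate() function and its data layout are hypothetical; SHA-256 and the
byte-for-byte comparison on a hash match are assumptions made for
illustration, not a description of how ZFS works.

```python
import hashlib


def aggregate(datasets):
    """Collapse identical blocks across datasets into one stored copy.

    datasets: dict mapping dataset name -> list of blocks (bytes).
    Returns (store, pointers, saved), where store maps digest -> block,
    pointers maps dataset name -> list of digests, and saved is the number
    of bytes that no longer need a copy of their own.
    """
    store = {}
    pointers = {}
    saved = 0
    for name, blocks in datasets.items():
        refs = []
        for data in blocks:
            digest = hashlib.sha256(data).digest()
            existing = store.get(digest)
            if existing is None:
                # First time we see this content: keep it.
                store[digest] = data
            elif existing != data:
                # Same digest, different bytes: refuse to merge.
                raise ValueError("hash collision, blocks differ")
            else:
                # Duplicate content: point at the copy we already have.
                saved += len(data)
            refs.append(digest)
        pointers[name] = refs
    return store, pointers, saved
```

After the pass, every dataset points at a single stored copy of each distinct
block, and `saved` reports how much space the duplicates were taking.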

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
pjd at FreeBSD.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!



[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-12 Thread Darren J Moffat
Pawel Jakub Dawidek wrote:
> I understand it's not trivial, but being able to reference the same
> block from different datasets would be a really nice feature to have.
> The functionality discussed above is only one example. Another example
> would be block aggregation (which has its own name I can't recall right
> now), so we can run a thread once a day that frees duplicated blocks and
> makes datasets point at one copy only.

NTFS can do this within a single filesystem.

It feels to me almost like the opposite of ditto blocks :-) Though I 
would still want ditto blocks to work with this.

-- 
Darren J Moffat



[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-12 Thread Pawel Jakub Dawidek
On Fri, Oct 12, 2007 at 11:10:31AM +0100, Darren J Moffat wrote:
> Pawel Jakub Dawidek wrote:
> > I understand it's not trivial, but being able to reference the same
> > block from different datasets would be a really nice feature to have.
> > The functionality discussed above is only one example. Another example
> > would be block aggregation (which has its own name I can't recall right
> > now), so we can run a thread once a day that frees duplicated blocks and
> > makes datasets point at one copy only.
> 
> NTFS can do this within a single filesystem.
> 
> It feels to me almost like the opposite of ditto blocks :-) Though I 
> would still want ditto blocks to work with this.

My main use will be for things like Solaris zones or FreeBSD jails. It
is nice to have one base file system, which you can just clone, but over
time the clones get bigger and bigger. If you sell virtual
web servers, for example, and all your customers upgrade Apache to a new
version, you end up with X copies of the same blocks and you lose
everything you saved by using clones initially. Being able to run a
process in the background every night, which will aggregate the blocks
back, would be really nice.

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
pjd at FreeBSD.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!



[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-12 Thread Matthew Ahrens
Pawel Jakub Dawidek wrote:
> On Fri, Oct 12, 2007 at 11:10:31AM +0100, Darren J Moffat wrote:
>> Pawel Jakub Dawidek wrote:
>>> I understand it's not trivial, but being able to reference the same
>>> block from different datasets would be a really nice feature to have.
>>> The functionality discussed above is only one example. Another example
>>> would be block aggregation (which has its own name I can't recall right
>>> now), so we can run a thread once a day that frees duplicated blocks and
>>> makes datasets point at one copy only.
>> NTFS can do this within a single filesystem.
>>
>> It feels to me almost like the opposite of ditto blocks :-) Though I 
>> would still want ditto blocks to work with this.
> 
> My main use will be for things like Solaris zones or FreeBSD jails. It
> is nice to have one base file system, which you can just clone, but over
> time the clones get bigger and bigger. If you sell virtual
> web servers, for example, and all your customers upgrade Apache to a new
> version, you end up with X copies of the same blocks and you lose
> everything you saved by using clones initially. Being able to run a
> process in the background every night, which will aggregate the blocks
> back, would be really nice.

Yeah, that would be nice.  De-duplication is on the list of problems we'd 
like to attack sooner rather than later.

--matt



[zfs-code] System call to create a clone of a file on a ZFS filesystem?

2007-10-25 Thread Torsten "Paul" Eichstädt
Quick shot: What's wrong with maintaining a reference count?
Free the block only when the reference count reaches zero. 
Count the bytes in each fs it's in, but only once in (each of, recursively) the 
parent(s) -- maybe this is costly, because now the parent's size is not the sum 
of its children.
I don't know about the mathematical characteristics of the ZFS checksum, but I 
assume it's good at detecting bit errors and might not be good enough to find data 
suitable for aggregation.
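
The counting scheme above could be sketched roughly as follows. This is a toy
model: the Pool class and its methods are hypothetical, and it ignores
snapshots entirely, which earlier replies point out are a large part of what
makes freeing nontrivial.

```python
from collections import Counter


class Pool:
    """Toy pool where one block may be referenced by several datasets."""

    def __init__(self):
        self.size = {}             # block id -> size in bytes
        self.refcount = Counter()  # block id -> number of datasets holding it
        self.datasets = {}         # dataset name -> set of block ids

    def add_block(self, dataset, block_id, nbytes):
        held = self.datasets.setdefault(dataset, set())
        if block_id in held:
            return  # this dataset already references the block
        held.add(block_id)
        self.size[block_id] = nbytes
        self.refcount[block_id] += 1

    def drop_block(self, dataset, block_id):
        # Returns True only when the last reference went away and the
        # block was actually freed.
        self.datasets[dataset].remove(block_id)
        self.refcount[block_id] -= 1
        if self.refcount[block_id] == 0:
            del self.refcount[block_id]
            del self.size[block_id]
            return True
        return False

    def dataset_bytes(self, dataset):
        # Each dataset is charged the full size of every block it references.
        return sum(self.size[b] for b in self.datasets[dataset])

    def pool_bytes(self):
        # The parent counts each block once, however many children share it.
        return sum(self.size.values())
```

With one 4096-byte block shared by two datasets, each dataset reports 4096
bytes while the pool reports 4096 as well, so the parent's size is no longer
the sum of its children.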

Paul