On Sat, Dec 19, 2009 at 17:20, Bob Friesenhahn <bfrie...@simple.dallas.tx.us> wrote:

> On Sat, 19 Dec 2009, Colin Raven wrote:
>
>>
>> There is no original, there is no copy. There is one block with reference
>> counters.
>>
>> - Fred can rm his "file" (because clearly it isn't a file, it's a filename
>> and that's all)
>> - result: the reference count is decremented by one - the data remains on
>> disk.
>>
>
> While the similarity to hard links is a good analogy, there really is a
> unique "file" in this case.  If Fred does a 'rm' on the file then the
> reference count on each of the file's blocks is reduced by one, and a block
> is freed if its reference count goes to zero.  Behavior is similar to the case
> where a snapshot references the file block.  If Janet updates a block in the
> file, then that updated block becomes unique to her "copy" of the file (and
> the reference count on the original is reduced by one) and it remains unique
> unless it happens to match a block in some other existing file (or snapshot
> of a file).
>
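
To make the description above concrete, here is a minimal Python sketch of
reference-counted blocks. It is a toy model only: the DedupPool class, its
method names, and the use of SHA-256 as the block key are assumptions made
for illustration, not ZFS internals.

```python
import hashlib
from collections import Counter


class DedupPool:
    """Toy model (hypothetical): one stored block per unique content."""

    def __init__(self):
        self.refcount = Counter()  # block checksum -> number of references
        self.files = {}            # filename -> list of block checksums

    def _ref(self, data: bytes) -> str:
        # Identical content maps to the same key, so it is stored once.
        key = hashlib.sha256(data).hexdigest()
        self.refcount[key] += 1
        return key

    def _unref(self, key: str) -> None:
        self.refcount[key] -= 1
        if self.refcount[key] == 0:
            del self.refcount[key]  # last reference gone: block is freed

    def write_file(self, name, blocks):
        self.files[name] = [self._ref(b) for b in blocks]

    def rm(self, name):
        # Removing a file drops one reference on each of its blocks.
        for key in self.files.pop(name):
            self._unref(key)

    def update_block(self, name, index, data):
        # The new content gets its own reference; the old block loses one.
        old = self.files[name][index]
        self.files[name][index] = self._ref(data)
        self._unref(old)


pool = DedupPool()
pool.write_file("fred", [b"alpha", b"beta"])
pool.write_file("janet", [b"alpha", b"beta"])   # shares both blocks
pool.update_block("janet", 1, b"gamma")         # now unique to Janet's file
pool.rm("fred")                                 # frees "beta", keeps "alpha"
print(len(pool.refcount))                       # 2 blocks remain
```

After the last step only the "alpha" and "gamma" blocks remain, each with a
single reference, which matches the behavior described in the quoted text.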

Wait...whoah, hold on.
If snapshots reside within the confines of the pool, are you saying that
dedup will also count what's contained inside the snapshots? I'm not sure
why, but that thought is vaguely disturbing on some level.
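
A rough way to picture this, going only by the quoted description: treat a
snapshot as one more holder of references to the same blocks. The Python
sketch below does exactly that. The names hold and release are hypothetical,
and this is a simplification, not ZFS's actual bookkeeping for snapshots.

```python
from collections import Counter

refs = Counter()  # block content -> number of holders (files or snapshots)


def hold(blocks):
    """Add one reference per block; return the holder's own block list."""
    for b in blocks:
        refs[b] += 1
    return list(blocks)


def release(blocks):
    """Drop one reference per block, freeing blocks that reach zero."""
    for b in blocks:
        refs[b] -= 1
        if refs[b] == 0:
            del refs[b]


live = hold(["A", "B"])      # the live file
snap = hold(live)            # a snapshot pins the same blocks
release(live)                # rm the live file
survivors = sorted(refs)     # ['A', 'B'], still held by the snapshot
new_file = hold(["A", "C"])  # "A" matches the snapshot's block, "C" is new
```

In this model the blocks outlive the rm because the snapshot still holds
them, and a file written later shares block "A" with the snapshot.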

Then again (not sure how the gurus feel on this point), I have this probably
naive and foolish belief that snapshots (mostly) oughtta reside on a
separate physical box/disk_array..."someplace else" anyway. I say "mostly"
because I s'pose keeping 15 minute snapshots on board is perfectly OK - and
in fact handy. Hourly...ummm, maybe the same - but Daily/Monthly should
reside "elsewhere".

>
> When we are children, we are told that sharing is good.  In the case of
> references, sharing is usually good, but if there is a huge amount of
> sharing, then it can take longer to delete a set of files since the mutual
> references create a "hot spot" which must be updated sequentially.
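
The "hot spot" can be illustrated by counting how many reference-count
updates each block entry receives when a set of files is deleted. This
Python sketch is an assumption-laden toy: the function name and the file
layouts are made up, and it only counts updates. It says nothing about
on-disk layout or timing.

```python
from collections import Counter


def refcount_updates_on_delete(files):
    """Count refcount decrements per block entry if every file is deleted."""
    updates = Counter()
    for blocks in files.values():
        for key in blocks:
            updates[key] += 1  # one decrement on this block's entry
    return updates


# 1000 files that all share one block vs. 1000 files with one unique block each.
shared = {f"file{i}": ["same-block"] for i in range(1000)}
unique = {f"file{i}": [f"block{i}"] for i in range(1000)}

hot = refcount_updates_on_delete(shared)
spread = refcount_updates_on_delete(unique)

print(max(hot.values()))     # 1000 updates land on a single entry
print(max(spread.values()))  # 1 update per entry
```

Both cases perform the same total number of updates; with heavy sharing they
all hit the same entry, so they cannot be spread out.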


Y'know, that is a GREAT point. Taking this one step further then - does that
also imply that there's one "hot spot" physically on a disk that keeps
getting read/written to? If so, then your point has even greater merit for
more reasons...disk wear for starters, and other stuff too, no doubt.


> Files are usually created slowly so we don't notice much impact from this
> sharing, but we expect (hope) that files will be deleted almost
> instantaneously.

Indeed, that is completely logical. Also, it's something most of us don't
spend time thinking about.

Bob, thanks. Your thoughts and insights are always interesting - and usually
most revealing!
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss