On Thu, Jun 10, 2010 at 12:21:02PM -0600, Lori Alt wrote:
> It's not possible to implement unless we establish a bidirectional 
> communication between the sending and receiving side.  The logic for 
> send-stream dedup is:
> 
> for (each block to be written to stream) {
> 
>    get the block's checksum
> 
>    lookup the block's checksum in the dedup-table
>         established for *this* stream generation
> 
>    if (an entry in the DDT exists for this checksum)
> 
>        send a "write-by-reference" block across the stream
>             (this contains a reference to a block sent earlier in the
>        stream)
> 
>    else {
> 
>        add an entry for this block to the DDT
> 
>        send the full block
> 
>    }
> 
> }
> 
> Since the dedup table on the sending side only knows about blocks already 
> sent in the stream, we have no way of knowing whether a copy of the 
> block already exists on the other side, and even if we did know, we 
> wouldn't know where it was on the other side.  The sending side would 
> have to have a copy of the other side's on-disk DDT to know whether a 
> write-by-reference could be used.
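
For reference, the quoted loop can be modelled in a few lines of Python.
This is only a toy sketch under stated assumptions, not the actual zfs
send code: the record tuples, the names dedup_send and receive, and the
use of SHA-256 as the block checksum are all made up for illustration.

```python
import hashlib


def dedup_send(blocks):
    """Yield stream records for a sequence of blocks (bytes objects).

    Hypothetical record shapes:
      ("write", index, data)        full block
      ("ref", index, earlier_index) write-by-reference to an earlier block
    """
    # The table lives only for the duration of this one stream generation.
    ddt = {}  # checksum -> index of the first block with that checksum
    for index, data in enumerate(blocks):
        checksum = hashlib.sha256(data).digest()
        if checksum in ddt:
            # Already sent earlier in this stream: send only a reference.
            yield ("ref", index, ddt[checksum])
        else:
            ddt[checksum] = index
            yield ("write", index, data)


def receive(records):
    """Rebuild the block list; references resolve within the stream only."""
    out = []
    for kind, _index, payload in records:
        out.append(payload if kind == "write" else out[payload])
    return out


blocks = [b"aaaa", b"bbbb", b"aaaa"]
records = list(dedup_send(blocks))
```

Here the third block goes out as a reference to the first, and the
receiver can resolve it because the referenced block arrived earlier in
the same stream. Nothing in the table describes blocks that already
exist on the receiving side.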

If we send an incremental stream, we can be sure that the other side has
the same data up to the previous snapshot. I'm aware that this doesn't
mean the data has exactly the same checksums (e.g. it may be compressed
with a different algorithm). But in theory, are we able to figure out
that a given block we are about to send is already part of the dataset's
previous snapshot? I'm fine with the remote side rejecting the
incremental stream if it uses a different compression algorithm or
simply has deduplication turned off (basically, whenever there is no
block matching the stored checksum). But if I have identical
configurations on both ends, I'd like to avoid sending the same block
multiple times in multiple incremental streams.
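
To make the question concrete, here is a toy Python model of what I have
in mind. Everything in it is hypothetical: the function name
incremental_send, the ("prev", block_id) reference form, and the
assumption that the receiver can look up a block of the previous
snapshot by such a reference, which is precisely the open point.

```python
import hashlib


def checksum(data):
    # Stand-in for the block checksum; SHA-256 is an assumption here.
    return hashlib.sha256(data).digest()


def incremental_send(prev_snapshot, new_blocks):
    """Yield records for new_blocks, given the previous snapshot's blocks.

    prev_snapshot: dict mapping a hypothetical block id to block data,
    assumed to be identical on the sending and receiving sides.
    """
    # Seed the table with blocks the receiver is assumed to hold already.
    known = {checksum(data): ("prev", block_id)
             for block_id, data in prev_snapshot.items()}
    for index, data in enumerate(new_blocks):
        key = checksum(data)
        if key in known:
            # Reference a block in the previous snapshot or earlier in
            # this stream instead of sending the data again.
            yield ("ref", known[key])
        else:
            known[key] = ("stream", index)
            yield ("write", data)


prev = {"p0": b"aaaa"}
records = list(incremental_send(prev, [b"aaaa", b"cccc", b"cccc"]))
```

In this model the first block is never sent because it matches a block
of the previous snapshot, and the receiver would have to reject the
stream if it could not resolve such a reference.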

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
[email protected]                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

_______________________________________________
zfs-code mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-code
