On Thu, Jun 10, 2010 at 12:21:02PM -0600, Lori Alt wrote:
> It's not possible to implement unless we establish a bidirectional
> communication between the sending and receiving side. The logic for
> send-stream dedup is:
>
> for (each block to be written to stream) {
>     get the block's checksum
>
>     look up the block's checksum in the dedup-table
>     established for *this* stream generation
>
>     if (an entry in the DDT exists for this checksum)
>         send a "write-by-reference" block across the stream
>         (this contains a reference to a block sent earlier in the
>         stream)
>     else {
>         add an entry for this block to the DDT
>         send the full block
>     }
> }
>
> Since the dedup table on the sending side only knows about blocks already
> sent in the stream, we have no way of knowing whether a copy of the
> block already exists on the other side, and even if we did know, we
> wouldn't know where it was on the other side. The sending side would
> have to have a copy of the other side's on-disk DDT to know whether a
> write-by-reference could be used.

If we send an incremental stream, we can be sure that up to the previous
snapshot we have the same data on the other side. I'm aware that doesn't
mean the data has exactly the same checksum (e.g. it can be compressed with
a different algorithm). But in theory, are we able to figure out that a
given block we are about to send is already part of the dataset's previous
snapshot?

I'm fine with discarding the incremental stream on the remote site if it
uses a different compression algorithm or simply has deduplication turned
off (basically when there is no block matching the stored checksum). But if
I have identical configurations on both ends, I'd like not to send the same
block multiple times in multiple incremental streams.

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
[email protected]                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
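
For reference, the send-side loop quoted above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
actual ZFS send code: the function names dedup_stream and receive, the
("write", ...) / ("ref", ...) record shapes, and the use of SHA-256 as the
block checksum are all hypothetical choices made for the example. The table
is built fresh for each stream, so it can only refer back to blocks sent
earlier in that same stream, which is the limitation the quoted text
describes.

```python
import hashlib


def dedup_stream(blocks):
    """Yield ("write", data) or ("ref", index) records for a sequence of blocks.

    The dedup table (ddt) lives only for this one stream generation, so a
    "ref" can point only at a full block written earlier in the same stream.
    """
    ddt = {}      # checksum -> index of the full write that carried this block
    writes = 0    # number of full blocks emitted so far
    for data in blocks:
        checksum = hashlib.sha256(data).digest()  # assumed checksum function
        if checksum in ddt:
            # Already sent in this stream: emit a write-by-reference record.
            yield ("ref", ddt[checksum])
        else:
            # First occurrence: remember it, then send the full block.
            ddt[checksum] = writes
            writes += 1
            yield ("write", data)


def receive(records):
    """Rebuild the original block sequence from a deduplicated stream."""
    written = []  # full blocks seen so far, in arrival order
    out = []
    for kind, payload in records:
        if kind == "write":
            written.append(payload)
            out.append(payload)
        else:
            # Resolve the reference against blocks received earlier.
            out.append(written[payload])
    return out
```

Nothing in this sketch lets the sender learn which blocks the receiver
already holds from a previous snapshot; that would require extra knowledge
about the receiving side, as the quoted message says.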
_______________________________________________
zfs-code mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-code
