On 6/10/2010 1:21 PM, Pawel Jakub Dawidek wrote:

If we send an incremental stream we can be sure that up to the previous
snapshot we have the same data on the other side. I'm aware it doesn't
mean the data has exactly the same checksum (e.g. it can be compressed
with a different algorithm). But in theory, are we able to figure out
that a given block we try to send is already part of the dataset's
previous snapshot? I'm fine with discarding the incremental stream on
the remote side if it uses a different compression algorithm or if
deduplication is simply turned off (basically, when there is no block
matching the stored checksum). But if I have identical configurations on
both ends, I'd like not to send the same block multiple times in
multiple incremental streams.

No, you can't be sure. You can *assume* you sent the proper incremental stream to the receiving host, but what if you didn't? Or it got deleted on the other side? etc.

You *have* to check with the receiving host to see what's there. As Lori pointed out, you need the DDT from the receiving host. As I said earlier, this looks to NOT need code changes, just a smart userland app. I'd use rsync's model, where you SSH over to the other host, run the same binary (which knows it's in "receive" mode), and set up the comms link between the two. The receiver's DDT gets generated, passed back to the sender, and the sender can then do lookups using both DDT sets. It's really not that complicated.
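To illustrate the idea, here is a minimal sketch of that exchange. This is not ZFS code and does not touch the real on-disk DDT (which also records compression, refcounts, etc.); it just uses SHA-256 as a stand-in checksum and a Python set as a stand-in DDT, and omits the SSH/transport plumbing entirely. The helper names (`receiver_ddt`, `plan_send`) are hypothetical.

```python
import hashlib

def receiver_ddt(blocks):
    """Receiver side: build a set of block checksums.

    A stand-in for the receiver's DDT; in real ZFS the table also
    carries compression info and reference counts per entry."""
    return {hashlib.sha256(b).hexdigest() for b in blocks}

def plan_send(candidate_blocks, remote_ddt):
    """Sender side: given the receiver's checksum set, decide which
    blocks actually need to go over the wire."""
    to_send, skipped = [], 0
    for block in candidate_blocks:
        if hashlib.sha256(block).hexdigest() in remote_ddt:
            skipped += 1          # receiver already has this block
        else:
            to_send.append(block)  # new data, must be transmitted
    return to_send, skipped

# Toy run: receiver already holds blocks A and B; the incremental
# stream would otherwise carry A, B and C. Only C needs to be sent.
remote = receiver_ddt([b"block-A", b"block-B"])
to_send, skipped = plan_send([b"block-A", b"block-B", b"block-C"], remote)
```

The design point is that the lookup happens on the sender, using the receiver's actual state rather than an assumption about what earlier streams delivered.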

My sole worry is that since 'zfs send' and 'zfs receive' are moving targets, tracking new ZFS filesystem version features, you'd have to constantly modify your new app to stay compatible with newer ZFS versions.

--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

_______________________________________________
zfs-code mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-code
