On Tue, 2002-04-09 at 17:25, Martijn van Oosterhout wrote:
> What you are suggesting is that the server store checksums for precalculated
> blocks on the server. This would be 4 bytes per 1k in the original file or
> so. The transaction proceeds as follows:
>
> 1. Client asks for checksum list off server
> 2. Client calculates checksums for local file
> 3. Client compares list of server with list of client
> 4. Client downloads changed regions.
>
> Note, this is not the rsync algorithm, but the one that is possibly
> patented.
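
As a rough illustration of the four quoted steps, here is a minimal Python
sketch. It is only a sketch under stated assumptions: the function names are
hypothetical, CRC32 is my guess at a 4-byte checksum (the quoted text only
says "4 bytes per 1k"), and both files are plain in-memory byte strings
rather than anything fetched over a network.

```python
import zlib

BLOCK_SIZE = 1024  # assumption: 1k blocks, as in the quoted description


def block_checksums(data, block_size=BLOCK_SIZE):
    """Return one CRC32 (4 bytes) per fixed-size block of data."""
    return [
        zlib.crc32(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(server_sums, local_data, block_size=BLOCK_SIZE):
    """Steps 2-3: checksum the local file and compare the two lists.

    Returns the indices of server blocks the client has to download.
    """
    local_sums = block_checksums(local_data, block_size)
    return [
        i for i, s in enumerate(server_sums)
        if i >= len(local_sums) or local_sums[i] != s
    ]


def update(local_data, server_data, block_size=BLOCK_SIZE):
    """Simulate one transaction; returns (new local file, fetched indices)."""
    # Step 1: the server would precompute and publish this list.
    server_sums = block_checksums(server_data, block_size)
    need = changed_blocks(server_sums, local_data, block_size)
    # Start from the local blocks, then overwrite the ones that differ.
    blocks = [
        local_data[i * block_size:(i + 1) * block_size]
        for i in range(len(server_sums))
    ]
    for i in need:
        # Step 4: stand-in for downloading block i from the server.
        blocks[i] = server_data[i * block_size:(i + 1) * block_size]
    return b"".join(blocks), need
```

Because the blocks sit at fixed offsets, an insertion near the start of the
file shifts every later block, so all of them compare as changed. A checksum
this short can also collide, so a real client would presumably verify the
whole file afterwards.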

This looks like an interesting algorithm, so I decided to compare it to
the diff scheme analyzed in

http://lists.debian.org/debian-devel/2002/debian-devel-200204/msg00502.html

The above message also gives my analysis methodology.

The results:
------------

- The following table summarizes the performance of the checksum-based
  scheme and the diff-based scheme under the assumption that users tend
  to perform apt-get update often. I think disk space is cheap and
  bandwidth is expensive, so 20 days of diffs is the best choice.

  Scheme                        Disk space   Bandwidth
  -----------------------------------------------------------
  Checksums (bwidth optimal)    26K          81K
  diffs (4 days)                32K          331K
  diffs (9 days)                71K          66K
  diffs (20 days)               159K         27K

- The analysis is unfairly favorable to the checksum scheme, because I
  do not count the bandwidth required to request all the changed blocks,
  only the bandwidth used to transmit the changed blocks.

- For the user model in the message above, the optimal block size for
  this algorithm is around 245 bytes.

- In the diff-based scheme, each mirror can decide on a
  diskspace/bandwidth tradeoff by simply keeping more old diffs or
  deleting some old diffs. The checksum-based scheme doesn't really
  support tweaking at the mirror.

- I tend to update every day. For people who update every day, the
  diff-based scheme only needs to transfer about 8K, but the
  checksum-based scheme needs to transfer 45K. So for me, diffs are
  better. :)

Best,
Rob