On Thu, Jan 14, 2010 at 13:22, Peter Corlett <ab...@cabal.org.uk> wrote:
> For de-duping purposes, SHA is still faster than you can pull the files off
> the disk and a secondary cheaper hash is unnecessary.

That reminds me of how I was disappointed to find that rsync generally
transfers complete files (rather than diffs) if both source and destination
are on a local file system -- before I realised that to compute the diffs, it
would have to read the entire first and second files, and if it's going to
read the entire first file from disk anyway, it can simply dump it over the
second file without checking. Computing diffs would be more work in this
case, not less.

So yes, I suppose something similar applies here -- you have to read the
entire file anyway, so you might as well go with SHA-$number_of_your_choice.

Cheers,
Philip
--
Philip Newton <philip.new...@gmail.com>
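
For illustration, here is a minimal sketch of the de-duping approach discussed above: read each file once in full and group files by digest. The choice of Python and of SHA-256 is an assumption (the thread leaves the SHA variant open), and the names `sha256_of_file` and `find_duplicates` are hypothetical, not taken from any tool mentioned here.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit in memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group regular files under root by digest; return groups of 2 or more."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[sha256_of_file(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Every file is read exactly once, so on a typical disk the run time is likely to be dominated by I/O rather than by the hash, which is the point made in the quoted message.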