David Eppstein wrote:

> In article <[EMAIL PROTECTED]>,
>  "Xah Lee" <[EMAIL PROTECTED]> wrote:
> 
>> a absolute requirement in this problem is to minimize the number of
>> comparison made between files. This is a part of the spec.
> 
> You need do no comparisons between files.  Just use a sufficiently 
> strong hash algorithm (SHA-256 maybe?) and compare the hashes.

I did it as follows (some time ago):

For each file:

    Is its file size already in the hash?

        If so, calculate the md5 (and store it); if the md5s are
        equal, then compare the files.

    Store the file's info in the hash.

In some cases it might be faster to drop the md5 (since computing it
reads all the data, while a direct comparison can stop at the first
difference).
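
A rough Python sketch of the steps above, not the original script: the
names find_duplicates, md5_of and use_md5 are made up for illustration,
and the list of paths is assumed to be collected elsewhere. It only
hashes files whose size has been seen before, stores each md5 so it is
computed at most once, and confirms a match with a full comparison.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=1 << 16):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths, use_md5=True):
    """Return (earlier_path, duplicate_path) pairs for identical files.

    Hypothetical helper: use_md5=False skips the hashing step and goes
    straight to the full comparison for files of equal size.
    """
    by_size = defaultdict(list)  # file size -> distinct files seen so far
    md5_cache = {}               # path -> stored md5 digest
    duplicates = []

    def cached_md5(path):
        if path not in md5_cache:
            md5_cache[path] = md5_of(path)
        return md5_cache[path]

    for path in paths:
        size = os.path.getsize(path)
        for seen in by_size[size]:  # only equal-sized files are candidates
            if use_md5 and cached_md5(seen) != cached_md5(path):
                continue  # digests differ, so the files cannot be identical
            # Full content comparison; shallow=False forces reading the data.
            if filecmp.cmp(seen, path, shallow=False):
                duplicates.append((seen, path))
                break
        else:
            # No identical file found: remember this one for later files.
            by_size[size].append(path)
    return duplicates
```

Calling find_duplicates(paths, use_md5=False) corresponds to dropping
the md5 step; which variant is faster will depend on how many files
share a size and how early they tend to differ.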

-- 
John                   Small Perl scripts: http://johnbokma.com/perl/
               Perl programmer available:     http://castleamber.com/
            Happy Customers: http://castleamber.com/testimonials.html
                        
-- 
http://mail.python.org/mailman/listinfo/python-list
