On Apr 15, 11:04 am, Nigel Rantor <wig...@wiggly.org> wrote:
> The fact that two md5 hashes are equal does not mean that the sources
> they were generated from are equal. To do that you must still perform a
> byte-by-byte comparison which is much less work for the processor than
> generating an md5 or sha hash.
>
> If you insist on using a hashing algorithm to determine the equivalence
> of two files you will eventually realise that it is a flawed plan
> because you will eventually find two files with different contents that
> nonetheless hash to the same value.
>
> The more files you test with the quicker you will find out this basic truth.
>
> This is not complex, it's a simple fact about how hashing algorithms work.
The only flaw in a cryptographic hash is the growing number of attacks found against it. You need to pick a trusted one when you start and consider replacing it every few years.

An *accidental* collision, although technically possible, is so extraordinarily rare that the risk is completely overshadowed by the risk of a hardware or software failure producing an incorrect result.
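To put a rough number on "extraordinarily rare", here is a minimal sketch of the standard birthday-bound approximation. It assumes the hash behaves like an ideal uniform random function and that nobody is deliberately constructing collisions. The function name `collision_probability` and the figure of one billion files are illustrative choices of mine, not anything from the thread.

```python
import math

def collision_probability(num_files, hash_bits):
    """Approximate chance that at least two of num_files distinct inputs
    share a digest, assuming an ideal hash with 2**hash_bits outputs.

    Birthday approximation: p ~= 1 - exp(-n*(n-1) / 2**(bits+1)).
    """
    pairs = num_files * (num_files - 1) / 2
    # expm1 keeps precision when the probability is far below 1e-16
    return -math.expm1(-pairs / 2.0 ** hash_bits)

# Hypothetical workload: one billion distinct files.
for name, bits in [("md5", 128), ("sha1", 160), ("sha256", 256)]:
    print(name, collision_probability(10**9, bits))
```

Under those assumptions, a billion files hashed with a 128-bit digest give a probability on the order of 1e-21. The estimate says nothing about deliberate attacks, which is the separate concern raised above.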