On 4/17/12 12:19 AM, "Martin v. Löwis" wrote:
> On 17.04.2012 00:09, Tarek Ziadé wrote:
>> On 4/16/12 11:57 PM, "Martin v. Löwis" wrote:
>>>> Maybe a better checksum would be a global hash calculated differently?
>>> Define a protocol, and I present you with an implementation that
>>> conforms to the protocol and still has inconsistent data, not in a
>>> malicious manner but due to bugs/race conditions/unexpected events.
>>> It's pointless.
>> If you calculate a checksum over all mirrored files, you can guarantee
>> that the bits are the same on both sides, no?
> How exactly would you calculate that checksum?

By calculating a grand hash over the individual file hashes.
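
Something along these lines, roughly (an untested sketch, leaving the
ordering question aside for a moment; it assumes md5sum from GNU
coreutils, "md5 -r" being roughly the OS X equivalent):

# hash every file under the mirror, then hash the list of hashes itself
$ find mirror -type f -print0 | xargs -0 md5sum > hashes.txt
$ md5sum hashes.txt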

> Would you really require concatenation of all files?

I did not say that. You are claiming it in a rhetorical question.

> That could take a few hours per change.

Why that? You don't calculate the checksum twice for a file you already have.

Even if you do, it's very fast to call md5.

Try it:

$ find mirror -type f | xargs md5

This takes a few seconds at most on the whole mirror.

> It would also raise the question in what order the files ought to be
> concatenated.

Anything reproducible, e.g. a sorted list. In bash I *suspect* the
calculation of the grand hash of the mirror is a one-liner that takes
less than a minute.
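
Something like this, I would guess (untested, GNU tools again; two
mirrors holding the same files under the same paths end up with the
same digest):

# sorted file list -> per-file hashes -> a single grand hash for the mirror
$ find mirror -type f -print0 | sort -z | xargs -0 md5sum | md5sum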

I am going to stop here anyway, because I don't see the point of discussing implementation details at this stage. We were barely starting to talk about the idea of a checksum, and that seems to be going nowhere.

Cheers
Tarek



