On 4/17/12 12:19 AM, "Martin v. Löwis" wrote:
> On 17.04.2012 00:09, Tarek Ziadé wrote:
>> On 4/16/12 11:57 PM, "Martin v. Löwis" wrote:
>>>> Maybe a better checksum would be a global hash calculated differently?
>>> Define a protocol, and I present you with an implementation that
>>> conforms to the protocol and still has inconsistent data, not in a
>>> malicious manner but due to bugs/race conditions/unexpected events.
>>> It's pointless.
>> If you calculate a checksum over all mirrored files, you can guarantee
>> that the bits are the same on both sides, no?
> How exactly would you calculate that checksum?

By calculating a grand hash over the individual file hashes.
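
Something along these lines, roughly (an untested sketch, leaving the
ordering question aside for a moment; it assumes md5sum from GNU
coreutils, "md5 -r" being roughly the OS X equivalent):

# hash every file under the mirror, then hash the list of hashes itself
$ find mirror -type f -print0 | xargs -0 md5sum > hashes.txt
$ md5sum hashes.txt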

> Would you really require concatenation of all files?

I did not say that. You are claiming it in a rhetorical question.

> That could take a few hours per change.

Why that? You don't calculate the checksum twice for a file you already have.

Even if you do, it's very fast to call md5.

Try it:

$ find mirror -type f | xargs md5

This takes a few seconds at most on the whole mirror.

> It would also raise the question in what order the files ought to be
> concatenated.

Anything reproducible, e.g. a sorted list. In bash I *suspect* the
calculation of the grand hash of the mirror is a one-liner that takes
less than a minute.
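
Something like this, I would guess (untested, GNU tools again; two
mirrors holding the same files under the same paths end up with the
same digest):

# sorted file list -> per-file hashes -> a single grand hash for the mirror
$ find mirror -type f -print0 | sort -z | xargs -0 md5sum | md5sum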

I am going to stop here anyway, because I don't see the point of discussing implementation details at this stage. We were barely starting to talk about the idea of a checksum, and that seems to be going nowhere.

Cheers
Tarek



