>>>>> "Tkil" == Tkil <[EMAIL PROTECTED]> writes:

Tkil> but the chance of any collision at all wigs me out.

>>>>> "Paul" == Paul Jackson <[EMAIL PROTECTED]> writes:

Paul> Guess you're just going to get wigged out then.

Wig wig.  :)

I didn't mean "wigs me out to the point I won't use it" but more of
"wigs me out so that I'm curious whether there are backup schemes
worth considering".

In particular, the comparisons between hash collisions and hardware
failure seem contrived -- if I have bad RAM, or a bad block on my HD,
I can recover the data from known good sources.  But if the actual
known good source is structured in such a way that a particular set of
data cannot be represented, that bothers me.

In this case, the fact that a colliding file would have to be the same
length, have the same SHA-1, be correct C, and be functionally similar
C at that, makes for a comforting cushion.  Further, git wouldn't be
the only representation; there would be periodic tarballs, different
trees, etc.

On the other paw, if "effectively random" MS Word docs gave true MD5
collisions (with a proper MD5 hash computed over the entire document)
among a "mere" 1e7 documents, that would be interesting/scary.
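
For scale, here is a back-of-the-envelope sketch of the usual birthday
bound.  It assumes the hash behaves like a uniformly random function,
which is an idealization and not a claim about MD5 or SHA-1 under
attack; the function name is made up for illustration.

```python
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Birthday-bound approximation for an idealized, uniform hash.

    p ~= 1 - exp(-n(n-1) / 2^(bits+1)), i.e. 1 - exp(-pairs / 2^bits).
    """
    pairs = n_items * (n_items - 1) / 2
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


md5_p = collision_probability(10**7, 128)   # MD5 digests are 128 bits
sha1_p = collision_probability(10**7, 160)  # SHA-1 digests are 160 bits

print(f"MD5,   1e7 documents: {md5_p:.3e}")
print(f"SHA-1, 1e7 documents: {sha1_p:.3e}")
```

Under that idealized assumption the MD5 figure comes out on the order
of 1e-25, so a genuine full-document collision among 1e7 documents
would point at something other than bad luck.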

(I was also trying to add a few factoids to the MS Word comment, as
the structure of those files could lead to collisions if (say) only
the first 512 bytes were hashed -- it's possible that nothing but size
and date would change in that range, and /those/ I can see colliding
in 1e7 documents.)
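
A minimal sketch of that failure mode, assuming a hypothetical tool
that hashes only a 512-byte prefix.  The "documents" and header bytes
below are invented for the example and are not real MS Word data.

```python
import hashlib

PREFIX_LEN = 512  # assumption: the hypothetical tool hashes only this much


def prefix_md5(data: bytes, prefix_len: int = PREFIX_LEN) -> str:
    """MD5 over the leading prefix_len bytes only."""
    return hashlib.md5(data[:prefix_len]).hexdigest()


def full_md5(data: bytes) -> str:
    """MD5 over the entire document."""
    return hashlib.md5(data).hexdigest()


# Two made-up documents: identical 512-byte header, different bodies.
header = b"FAKE-HEADER".ljust(PREFIX_LEN, b"\x00")
doc_a = header + b"first body"
doc_b = header + b"second, rather different body"

print("prefix hashes equal:", prefix_md5(doc_a) == prefix_md5(doc_b))
print("full hashes equal:  ", full_md5(doc_a) == full_md5(doc_b))
```

The prefix "collision" here says nothing about MD5 itself; the two
inputs to the hash are simply the same bytes.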

Finally, I apologize for taking your time.  I'm just watching this
from the sidelines, and the questions above are just intellectual
curiosity.  :-/

(The only other thread I'm really following is people trying to chunk
files in a way that would increase storage efficiency; reading the
Venti paper, I was wondering how efficient it could be when a one-byte
addition at the top of a file would generate all-new blocks, whereas
the rsync-ish approach seems to offer substantial relief.  But if the
"interesting history" fits in 10 USD worth of HD, that might be enough.
Babble.)
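
To make the chunking worry concrete, here is a toy comparison of
fixed-offset blocks against content-defined boundaries.  This is not
Venti's or rsync's actual algorithm: the rolling sum, the constants,
and every function name are assumptions chosen to keep the sketch
short.

```python
import hashlib
import random

BLOCK = 64    # fixed block size (illustrative)
WINDOW = 16   # width of the rolling window (illustrative)
DIVISOR = 64  # yields roughly 64-byte average chunks (illustrative)


def fixed_chunks(data: bytes) -> list:
    """Cut at fixed offsets, so boundaries depend on position."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def content_defined_chunks(data: bytes) -> list:
    """Cut wherever the sum of the last WINDOW bytes hits a magic value,
    so boundaries depend on nearby content, not on absolute offset."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte leaving the window
        if i + 1 >= WINDOW and rolling % DIVISOR == DIVISOR - 1:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def block_ids(chunks: list) -> set:
    """Address each chunk by its SHA-1, as a content-addressed store would."""
    return {hashlib.sha1(c).hexdigest() for c in chunks}


def reused_fraction(chunker, old: bytes, new: bytes) -> float:
    """Fraction of the new file's chunks already present in the old file."""
    old_ids, new_ids = block_ids(chunker(old)), block_ids(chunker(new))
    return len(old_ids & new_ids) / len(new_ids)


random.seed(0)
original = bytes(random.randrange(256) for _ in range(8192))
edited = b"X" + original  # the one-byte addition at the top of the file

fixed_reuse = reused_fraction(fixed_chunks, original, edited)
cdc_reuse = reused_fraction(content_defined_chunks, original, edited)
print(f"fixed-size blocks reused:      {fixed_reuse:.0%}")
print(f"content-defined chunks reused: {cdc_reuse:.0%}")
```

With fixed offsets every block shifts by one byte and hashes to
something new; with content-defined cuts only the chunk or two nearest
the edit change, because later boundaries land on the same content.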

Thanks,
t.

