> > I still cannot fathom why anyone would assign random numbers or (even
> > more useless) long random blobs to use as pseudo-keys. It just boggles
> > the mind.
>
> I take it you’re not a cryptographer :) All modern ciphers do this. For
> example, an RSA key pair is simply a pair of large random numbers (both
> prime) that meet certain criteria. Or if you use a more modern cipher like
> Curve25519, the private key is quite literally just any 256 bits of random
> data. You generate a key-pair by reading 32 bytes from /dev/random into
> the private key, and then performing a transformation on that to get the
> public key.

I know and understand the uses of random numbers, encryption, and digests when used for the purpose for which they were invented. What I do not understand is why one would use a UUID (a randomly generated bunch of bytes) as a key in a database. It is long, every use must be checked for collisions, and it is inherently far less efficient than the simple integer sequence it is replacing.

Of course, it is just a fad (like big huge wastes of whitespace and unreadable low-contrast ittybitty fonts in current web-page bootifications) adopted by those unable to comprehend the consequences of their decisions (and if they haven't had any yet, they are very lucky indeed).

> Obviously collisions are possible with long random numbers or digests, but
> secure systems are designed such that random collisions are vanishingly
> unlikely to occur for {insert large power of ten here} years, which makes
> the probability effectively zero.

No, you are incorrect. A "good hash function" will evenly spread its collisions over its digest space. If you feed all possible 512-bit blocks into a 512-bit hash to obtain the output digests, then when you feed in one more input (a 513-bit one) you must get a collision, because there are now more inputs than possible digests. If you feed in another 513-bit input you will get a different collision. The "collision" digest will not be predictable (that is, it will not "just always be the same as the first 512-bit block's digest with bit 438 flipped").

The useful property is that it is infeasible (very complex and taking a long time) to generate an input (chosen text) which results in a specific digest. The fact that it can and must have a 100% probability of collision when the input space is larger than the output space is irrelevant.

The problem is an inability to properly determine and assess risk. When using a sequence, the probability of a collision is 0. Using a randomly generated number (passing a bunch of random data through a digest function) has a probability of collision of 100%.
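
To make the "input space larger than output space" point concrete, here is a minimal sketch. It is an illustration only, under loud assumptions: SHA-256 is truncated to a single byte so the digest space has just 256 values, and the helper names `tiny_digest` and `first_collision` are made up for this example. Feeding in 257 distinct inputs must then produce a collision, while a plain sequence of the same length never repeats.

```python
import hashlib

DIGEST_BITS = 8  # assumption: a deliberately tiny digest space (256 values)


def tiny_digest(data: bytes) -> int:
    """Hypothetical helper: SHA-256 truncated to its first byte."""
    return hashlib.sha256(data).digest()[0]


def first_collision(count: int):
    """Hash the integers 0..count-1; return (earlier, later, digest) for the
    first pair of inputs sharing a digest, or None if there is no collision."""
    seen = {}  # digest -> first input that produced it
    for i in range(count):
        d = tiny_digest(i.to_bytes(4, "big"))
        if d in seen:
            return seen[d], i, d
        seen[d] = i
    return None


# 257 distinct inputs into 256 possible digests: a collision is guaranteed.
result = first_collision(2 ** DIGEST_BITS + 1)
assert result is not None

# A sequence of the same length, by contrast, never repeats a value.
sequence_ids = list(range(1, 2 ** DIGEST_BITS + 2))
assert len(set(sequence_ids)) == len(sequence_ids)
```

Which two inputs collide, and on which digest value, is not something you can tell without actually running the hash. That is the unpredictability described above.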

Only if you have (for example) a sequence-assigned "systemid" which is used as part of the input to the digest function, and use the generated recordid sequence number as input to the digest along with the random data, does the probability of collision reduce from 100% to some small number greater than 0%. Using the systemid sequence and the recordid sequence directly, however, has a 0% probability of collision, so any rational person would use that directly and forgo entirely the introduction of uncertainty and bugs that "UUID" type crappola will cause. Unfortunately there is a massive shortage of rational life on this planet.

> --Jens
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users