On Tue, Apr 22, 2014 at 6:57 PM, Stephan Beal <sgb...@googlemail.com> wrote:
> On Tue, Apr 22, 2014 at 6:48 PM, Richard Hipp <d...@sqlite.org> wrote:
>> Fossil generates some of its "GUID"s using the SHA1 hash algorithm. Other
>> GUIDs (for example for ticket IDs) are generated using:
>>
>> SELECT lower(hex(randomblob(20)));
>>
>> You can increase the 20 to make the GUIDs as "globally unique" as you
>> want. The GUIDs discussed previously in this thread seem use 16 instead of
>> 20 and thus are less unique.
>>
>
> That reminds me of a specific snippet from this article:
>
> http://www.w3.org/DesignIssues/Axioms.html#nonunique
>
> In summary: the context of a GUID defines its "scope of required
> uniqueness," and a 16-byte GUID is essentially globally unique so long as
> it has no collisions within its context(s). (i.e. who cares if SHA1s
> collide, so long as it's not in the same repo?)
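For anyone who wants to try the randomblob() approach quoted above, here is a minimal sketch that runs the same SELECT through Python's sqlite3 module with 16 and 20 bytes. The helper name random_guid is made up for this example; only randomblob(), hex() and lower() come from SQLite itself.

```python
import sqlite3

def random_guid(conn, nbytes=20):
    """Return a lowercase hex GUID built from nbytes random bytes.

    random_guid is a hypothetical helper; the SQL is the statement
    quoted above, with the byte count passed as a bound parameter.
    """
    row = conn.execute(
        "SELECT lower(hex(randomblob(?)))", (nbytes,)
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
guid16 = random_guid(conn, 16)  # 128 bits, 32 hex characters
guid20 = random_guid(conn, 20)  # 160 bits, 40 hex characters
print(guid16)
print(guid20)
```

The 20-byte form has the same width as a SHA1 hash (160 bits), while the 16-byte form matches the size of a conventional 128-bit GUID.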
First, SHA1 hashes and GUIDs, although they look alike (size
notwithstanding), are not the same thing. Hashes like SHA1 derive their
value from actual content (at a point in time), so in that sense they are
better than randomly generated GUIDs. But not every application can easily
compute content hashes (using SHA1, SHA256, or whatever other secure
hashing algorithm) for its content. And for mutable entities, content
hashes would by definition also change (ignoring very unlikely
collisions), unlike GUIDs, which are arbitrary and immutable "by design".
That makes GUIDs suitable as PKs of mutable entities.

Regarding the uniqueness argument made by DRH, it's actually very hard to
generate 2 colliding random-based GUIDs, given how large the 128-bit space
is: 2^128 is roughly 3.4 x 10^38 possible values. It's good enough for my
own uses.

Being of the curious type, I wrote a little test to generate a large
number of GUIDs (using boost::uuid), sort them, and then look for the
longest prefix shared by any two of them (byte-wise, not char-wise). To
keep things simple, I did that in memory, so I could only generate half a
billion, and the longest common prefix I found was 7 bytes, out of the 16
bytes. Intuitively, I suspect one must generate an increasingly large
number of GUIDs to increase the common prefix length by 1 byte each time,
but I didn't verify this intuition.

So yes, in theory, one will eventually run out of bits using a 128-bit
(integer) GUID, but in practice I don't think it matters much.

My $0.02. --DD
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
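
The prefix test described in the message above can be sketched roughly as follows. This is not the original boost::uuid program; it is a small Python stand-in that uses uuid.uuid4() and a far smaller sample. The names common_prefix_len, longest_shared_prefix and COUNT are all made up for this example.

```python
import uuid

def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def longest_shared_prefix(guids):
    """Longest byte-wise prefix shared by any two GUIDs in the list.

    After sorting, some adjacent pair always attains the longest
    common prefix, so one linear pass over neighbours is enough.
    """
    ordered = sorted(guids)
    return max(
        (common_prefix_len(a, b) for a, b in zip(ordered, ordered[1:])),
        default=0,
    )

# COUNT is an arbitrary small sample so the sketch runs quickly;
# the test described above used about half a billion GUIDs.
COUNT = 100_000
guids = [uuid.uuid4().bytes for _ in range(COUNT)]
print(longest_shared_prefix(guids))
```

As a back-of-the-envelope check on the intuition about growth: n GUIDs form about n^2/2 pairs, and for any given pair each additional shared byte is 256 times less likely, so n has to grow by roughly a factor of 16 to gain one more byte of common prefix. This ignores the few fixed version and variant bits in a version-4 UUID, which make a long match slightly more likely than for fully random bytes.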