On 13/03/2011, at 7:53 AM, DAZ wrote:
> How likely is rand(36**8).to_s(36) to have a collision compared to
> truncating UUIDTools::UUID.random_create?
If you have N possible values, you have an approximately 50/50 chance
of a collision among sqrt(N) perfectly random values. Or to put it
another way, if you have an ID of 2*M bits, you get to roughly 50/50
in a population of 2^M values. This is called the birthday paradox;
google for it.
There's a good reason why UUIDs contain 128 bits (as do MD5 hashes),
while SHA-1 hashes are 160 bits, SHA-256 is 256, etc.
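To put rough numbers on that for the rand(36**8) case, here is a minimal sketch using the standard approximation p = 1 - exp(-k(k-1)/(2N)) for k random draws from N values. The method names are made up for illustration, and it assumes the generator is uniformly random.

```ruby
# Approximate probability of at least one collision among k uniformly
# random draws from n possible values (birthday approximation).
def collision_probability(k, n)
  1.0 - Math.exp(-k * (k - 1) / (2.0 * n))
end

# Approximate number of draws needed before the collision
# probability reaches p.
def draws_for_probability(p, n)
  Math.sqrt(2.0 * n * Math.log(1.0 / (1.0 - p)))
end

n = 36**8                      # size of the rand(36**8) ID space
k = Math.sqrt(n).round         # sqrt(N) = 36**4 = 1_679_616 IDs

puts collision_probability(k, n)
puts draws_for_probability(0.5, n).round
```

Under this approximation the chance of a collision at exactly sqrt(N) IDs is about 39%, and it reaches 50% at about 1.18 * sqrt(N), just under 2 million IDs for 36**8. So "sqrt(N)" is the right order of magnitude rather than an exact threshold.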
> I realise that with smaller strings the chances of collision are
> larger. How do sites like disqus and bit.ly make their short urls?
They probably use a character encoding of a database ID.
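As an illustration of that idea (not a description of how disqus or bit.ly actually work), here is a small sketch that encodes an integer primary key in base 62. The alphabet and the helper names encode_id and decode_token are assumptions made for this example.

```ruby
# Hypothetical 62-symbol alphabet: digits, lowercase, uppercase.
ALPHABET = ('0'..'9').to_a + ('a'..'z').to_a + ('A'..'Z').to_a

# Encode a non-negative integer (e.g. a database ID) as a short token.
def encode_id(id)
  return ALPHABET[0] if id.zero?
  out = ''
  while id > 0
    id, digit = id.divmod(62)
    out = ALPHABET[digit] + out   # prepend, so the most significant digit is first
  end
  out
end

# Decode a token back to the integer it was built from.
def decode_token(token)
  token.each_char.inject(0) { |acc, ch| acc * 62 + ALPHABET.index(ch) }
end

puts encode_id(1_000_000)
puts decode_token(encode_id(1_000_000))
```

Because the token is just another spelling of the primary key, uniqueness comes from the database rather than from chance, and the token only grows by one character for every factor of 62 in the ID.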
You should probably use a character encoding likewise. Here's a snippet
I've used, which reduces a full UUID to 20 characters (using base-91):
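require 'sysuuid'

# Convert an integer to an array of base-b digits, most significant first.
class Integer
  def base(b)
    self < b ? [self] : (self/b).base(b) + [self%b]
  end
end

# 91 printable ASCII characters (no space, quotes or backslash).
BASE91 = '!#$%&()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~'

# UUID -> integer -> base-91 string, then back to the integer.
uuid = sysuuid.gsub(/-/,'').reverse.hex
uuid20 = uuid.base(91).map{|i| BASE91[i].chr }*''
uuid_again = uuid20.split(//).inject(0){|i,e| i*91 + BASE91.index(e[0]) }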
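For anyone without the sysuuid gem, here is a sketch of the same round trip using only the standard library. It assumes Ruby 1.9.2 or later, where SecureRandom.uuid is available. The helper names are made up for this example. Unlike the snippet above, it skips the reverse step and always pads the result to exactly 20 characters.

```ruby
require 'securerandom'

BASE91_ALPHABET = '!#$%&()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~'

# Encode a UUID string as exactly 20 base-91 characters.
# 91**20 is larger than 2**128, so 20 digits always suffice.
def uuid_to_base91(uuid)
  n = uuid.delete('-').hex
  digits = []
  20.times do
    n, d = n.divmod(91)
    digits.unshift(BASE91_ALPHABET[d])
  end
  digits.join
end

# Decode a 20-character token back to the canonical UUID string.
def base91_to_uuid(token)
  n = token.each_char.inject(0) { |acc, ch| acc * 91 + BASE91_ALPHABET.index(ch) }
  hex = n.to_s(16).rjust(32, '0')
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join('-')
end

uuid  = SecureRandom.uuid
token = uuid_to_base91(uuid)
puts token
puts base91_to_uuid(token) == uuid
```

Fixing the length at 20 means small values are left-padded with '!' (the zero digit), which keeps tokens a uniform width for storage.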
Clifford Heath.