On 13/03/2011, at 7:53 AM, DAZ wrote:
How likely is rand(36**8).to_s(36) to have a collision compared to
truncating UUIDTools::UUID.random_create?

If you have N possible values, you have an approximately 50/50 chance
of a collision in sqrt(N) perfectly random values. Or to put it another way, if you have an ID of 2*M bits, you get to 50/50 at in a population of 2^M
values. This is called the birthday paradox, google for it.

There's a good reason why UUIDs contain 128 bits... (and MD5... and
SHA-1 hashes are 160, SHA-256 is 256, etc).

I realise that with smaller strings the chances of collision are
larger. How do sites like disqus and bit.ly make their short urls?


They probably use a character encoding of a database ID.

You should probably use a character encoding likewise. Here's a snippet
I've used, which reduces a full UUID to 20 characters (using base-91):

require 'sysuuid'

class Integer
  def base(b)
    self < b ? [self] : (self/b).base(b) + [self%b]
  end
end

BASE91 = '!#$%&()*+,-./0123456789:;<=>? @ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~'

    uuid = sysuuid.gsub(/-/,'').reverse.hex
    uuid20 = uuid.base(91).map{|i| BASE91[i].chr }*''
uuid_again = uuid20.split(//).inject(0){|i,e| i*91 + BASE91.index(e[0]) }

Clifford Heath.

--
You received this message because you are subscribed to the Google Groups 
"DataMapper" group.
To post to this group, send email to datamapper@googlegroups.com.
To unsubscribe from this group, send email to 
datamapper+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/datamapper?hl=en.

Reply via email to