Kai Krakow <hurikha...@gmail.com> schrieb: > Matti Nykyri <matti.nyk...@iki.fi> schrieb: > >> If you are looking a mathematically perfect solution there is a simple >> one even if your list is not in the power of 2! Take 6 bits at a time of >> the random data. If the result is 62 or 63 you will discard the data and >> get the next 6 bits. This selectively modifies the random data but keeps >> the probabilities in correct balance. Now the probability for index of >> 0-61 is 1/62 because the probability to get 62-63 out of 64 if 0. > > Why not do just something like this? > > index = 0; > while (true) { > index = (index + get_6bit_random()) % 62; > output << char_array[index]; > } > > Done, no bits wasted. Should have perfect distribution also. We also don't > have to throw away random data just to stay within unaligned boundaries. > The unalignment is being taken over into the next loop so the "error" > corrects itself over time (it becomes distributed over the whole set).
It is worth noting that my approach has the tendency of generating random characters in sequence. I think there are several possibilities to fix this, one being swapping two elements in the char array on each access. That fixed this particular problem in my tests. Here's are quick ruby snippet I played with: $ cat random.rb # danger: ugly code ahead dist = [0] * 62 seq = [0, 0] chars = [*('A'..'Z'), *('a'..'z'), *('0'..'9')] orig_chars = chars.dup index = 0 old = nil (62 * 10).times do # generate random wrapped index index = (index + rand(64)) % 62 # swap two indexed positions and output the char chars[0], chars[index] = chars[index], chars[0] $stdout.write chars[0] # count the generated indexes dist[index] = dist[index] + 1 # count if indexes were generated in sequence by looking # them up in the original sorted chars array unless old.nil? seq[0] = seq[0] + 1 if old < orig_chars.index(chars[0]) seq[1] = seq[1] + 1 if old > orig_chars.index(chars[0]) end old = orig_chars.index(chars[0]) end puts # check the average, it should always equal the loop count # multiplicator avg = dist.inject(:+) / 62.0 # calculate if the algorithm preferred sequences seq = [*seq, seq[0] - seq[1]] # output some stats puts avg puts dist.inspect puts dist.map {|v| avg - v }.inspect puts seq.inspect -- Replies to list only preferred.