On Sat, Jun 07, 2014 at 12:03:29AM +0300, Matti Nykyri wrote:
> On Thu, Jun 05, 2014 at 10:58:51PM -0500, Canek Peláez Valdés wrote:
> > On Thu, Jun 5, 2014 at 9:56 PM,  <meino.cra...@gmx.de> wrote:
> > > Hi,
> > >
> > > I am experimenting with the C code of the ISAAC pseudo random number 
> > > generator
> > > (http://burtleburtle.net/bob/rand/isaacafa.html).
> > >
> > > Currently the implementation creates (on my embedded linux) 32 bit
> > > hexadecimal output.
> > 
> > So it's a 32 bit integer.
> > 
> > > From this I want to create random numbers in the range of [a-Za-z0-9]
> > > *without violating randomness* and (if possible) without throwing
> > > away bits of the output.
> > 
> > You mean *characters* int the range [A-Za-z0-9]?
> 
> Well this isn't as simple problem as it sounds. A random 32 bit integer 
> has 32 bits of randomness. If you take a divison reminder of 62 from this 
> integer you will get only 5,95419631039 bits of randomness 
> (log(62)/log(2)). So you are wasting 81,4% of your random data. Which is 
> quite much and usually random data is quite expensive. You can save your 
> precious random data by taking only 6 bit from your 32 bit integer and 
> dividing it by 62. Then you will be wasting only 0,8% of random data. 
> Another problem is alignment, but that is about mathematical correctness.
> 
> > > How can I do this mathemtically (in concern of the quality of output)
> > > correct?
> > 
> > The easiest thing to do would be:
> 
> The easiest is not mathematically correct though. Random data will stay 
> random only if you select and modify it so that randomness is preserved. 
> If you take devison reminder of 62 from 32 bit integer there are 69 273 
> 667 possibilities of the reminder to be 3 or less. For the reminder to 4 
> or more the number of possibilities is 69 273 666. In mathematically 
> ideal case the probability for every index of the list should be same: 
> 1/62 = 1,61290322581%. But the modulo 62 modifies this probability: for 
> index 0-3 the probability is 69 273 667/2^32 = 1,61290324759%. And for 
> indexes 4-61 the probability will be 69 273 666/2^32 = 1,6129032243%.
> 
> If you wish not to waste those random bits the probabilities will get 
> worse. With 6 bits of random the probability for index 0-1 will be 2/64 
> and for 2-63 it will be 1/64. This is a very significant change because 
> first and second index will appear twice as much as the rest. If you add 
> 2 characters to your list you will perfect alignment and you can take 6 
> bits of data without it modifying probabilities.
> 
> If you are looking a mathematically perfect solution there is a simple 
> one even if your list is not in the power of 2! Take 6 bits at a time of 
> the random data. If the result is 62 or 63 you will discard the data and 
> get the next 6 bits. This selectively modifies the random data but keeps 
> the probabilities in correct balance. Now the probability for index of 
> 0-61 is 1/62 because the probability to get 62-63 out of 64 if 0.
> 
> > -------------------------------------------------------------------------------
> > #include <time.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > 
> > #define N (26+26+10)
> > 
> > static char S[] = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
> > 'K', 'L', 'M',
> >                     'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
> > 'X', 'Y', 'Z',
> >                     'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
> > 'k', 'l', 'm',
> >                     'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
> > 'x', 'y', 'z',
> >                     '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
> > 
> > int
> > next_character()
> > {
> >         // Use the correct call for ISAAC instead of rand()
> >         unsigned int idx = rand() % N;
> >         return S[idx];
> > }
> 
> so modify the next_char function:
> 
> char next_character()
> {
>       static unsigned int rand = 0; //(sizeof(int) = 32)
>       static char bit_avail = 0;
>       char result = 0;
>       char move_bits = 0;
>       char bits_moved = 0;
> 
>       do {
>               if (!bits_avail) {
>                       // Use the correct call for ISAAC instead of rand()
>                       rand = rand();
>                       
>                       bit_avail = 32;
>               }
> 
>               move_bits = bits_avail >= 6 ? 6 : bits_avail;
>               result <<= move_bits;
>               result = (result | rand & (0xFF >> (8 - move_bits))) & 0x3F;
>               bits_avail -= move_bits;
>               bits_moved += move_bits;
>               rand >>= move_bits;
> 
>       } while (bits_moved != 6 && result > 61);
> 
>       return result;
> }

Well actually it looks simpler if you break this like this:

unsigned char get_6bits () 
{
        static unsigned int rand = 0; //(sizeof(int) = 32)
        static char bits_avail = 0;
        unsigned char result = 0;

        //get 2 bits 3 times: 32 is devidable by 2
        for (int i = 0; i < 3; i++) { // --std=c99
                //Fill buffer if it is empty!
                if (!bits_avail || bits_avail < 0 ) { //if bits_avail < 0 it is 
an error!
                        // Use the correct call for ISAAC instead of rand()
                        rand = rand();
                        
                        bits_avail = 32;
                }

                result <<= 2; //move two bits to left.
                result = result | (rand & 0x3); //add two least signifigant 
bits to the result.
                rand >>= 2; //move two bits to right.
                bits_avail -= 2;
        }

        return result; //result has 6 bits of random data...
}

char next_character()
{
        unsigned char idx = 0;
        do {
                idx = get_6bits();
        } while (idx > 61);

        return S[idx];
}

Very simple :)

> This function will give perfect distribution of 1/62 probability for 
> every index. It will waste 6 bits with the probability of 1/32 (2/64).
> 
> > int
> > main(int argc, char* argv[])
> > {
> >         // Use the correct call for initializing the ISAAC seed
> >         srand((unsigned int)time(NULL));
> >         for (int i = 0; i < 20; i++) // --std=c99
> >                 printf("%c\n", next_character());
> >         return 0;
> > }
> > -------------------------------------------------------------------------------
> > 
> > If the ISAAC RNG has a good distribution, then the next_character()
> > function will give a good distribution among the set [A-Za-z0-9].
> > 
> > Unless I missunderstood what you meant with "create random numbers in
> > the range of [a-Za-z0-9]".

-- 
-Matti

Reply via email to