On Wed, Aug 03, 2022 at 04:21:43PM -0500, [email protected] wrote:
> >On Wed, Aug 03, 2022 at 08:26:20AM -0600, Theo de Raadt wrote:
> >> [email protected] wrote:
> >>
> >> > Another way to solve this problem would be to trim the numbers with
> >> > something like this: if (denom > UINT32_MAX) denom = UINT32_MAX.
> >>
> >> And then document that the program returns incorrect results?
>
> [email protected] wrote:
>
> >See this. There's also linked BSD-licensed code that should not be hard
> >to adapt.
> >
> >https://mumble.net/~campbell/2014/04/28/uniform-random-float?
> >
> >Then pick a line if the random number drawn is < 1/denom.
>
> So now we have two options. (1) We use the trim trick and document
> the range of values that the denominator can be in. (2) We use the
> code linked by tb@ to generate uniform doubles. (see patch below)
>
> (2) seems to work but I need to spend some more time to understand
> the algorithm behind it.
>
> (1) seems more robust cosidering that the man page will also contain
> information about the range of denom.
Thanks.
I have no strong opinion. I'm fine with either approach. It's such a
silly program...
Since you've already done 90% of the work for (2), we can push it over
the line. This will need adding Taylor R. Campbell's license since a
significant chunk of code is included verbatim. Add the license below
the existing one.
As an aside, random -e has been completely broken (it's non-uniform)
since forever. To fix -e, we should clamp denom to an integer between 1
and 256, otherwise the truncation of the exit exit code to an 8-bit int
introduces bias for numbers larger than 256 (that aren't powers of 2).
Restricting denom >= 1 would seem necessary for all modes of this
program given that 1/denominator is documented to be a probability.
The patch doesn't compile due to missing clz64. Instead of using a
__builtin (which I'm unsure if it's available on all our ancient gccs),
I think we can implement this naively. Something along these lines
should be good enough (only very lightly tested):
int
clz64(uint64_t x)
{
static const uint64_t mask[] = {
0xffffffff00000000, 0xffff0000, 0xff00, 0xf0, 0xc, 0x2,
};
static int zeroes[] = {
32, 16, 8, 4, 2, 1,
};
int clz = 0;
int i;
if (x == 0)
return 64;
for (i = 0; i < 6; i++) {
if ((x & mask[i]) == 0)
clz += zeroes[i];
else
x >>= zeroes[i];
}
return clz;
}
Some more comments inline.
>
> Index: random.c
> ===================================================================
> RCS file: /cvs/src/games/random/random.c,v
> retrieving revision 1.20
> diff -u -p -r1.20 random.c
> --- random.c 7 Mar 2016 12:07:56 -0000 1.20
> +++ random.c 3 Aug 2022 20:28:08 -0000
> @@ -38,6 +38,75 @@
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> +#include <stdint.h>
> +#include <math.h>
style: keep these in alphabetical order
> +
> +uint64_t random64(void)
style: tab should be a newline:
uint64_t
random64(void)
> +{
> + uint64_t r;
style: add empty line
> + arc4random_buf(&r, sizeof(uint64_t));
and here
> + return r;
> +}
> +
> +/*
> + * random_real: Generate a stream of bits uniformly at random and
> + * interpret it as the fractional part of the binary expansion of a
> + * number in [0, 1], 0.00001010011111010100...; then round it.
> + */
> +double
> +random_real(void)
> +{
> + int exponent = -64;
> + uint64_t significand;
> + unsigned shift;
> +
> + /*
> + * Read zeros into the exponent until we hit a one; the rest
> + * will go into the significand.
> + */
> + while (__predict_false((significand = random64()) == 0)) {
> + exponent -= 64;
> + /*
> + * If the exponent falls below -1074 = emin + 1 - p,
> + * the exponent of the smallest subnormal, we are
> + * guaranteed the result will be rounded to zero. This
> + * case is so unlikely it will happen in realistic
> + * terms only if random64 is broken.
> + */
> + if (__predict_false(exponent < -1074))
> + return 0;
> + }
> +
> + /*
> + * There is a 1 somewhere in significand, not necessarily in
> + * the most significant position. If there are leading zeros,
> + * shift them into the exponent and refill the less-significant
> + * bits of the significand. Can't predict one way or another
> + * whether there are leading zeros: there's a fifty-fifty
> + * chance, if random64 is uniformly distributed.
> + */
> + shift = clz64(significand);
> + if (shift != 0) {
> + exponent -= shift;
> + significand <<= shift;
> + significand |= (random64() >> (64 - shift));
> + }
> +
> + /*
> + * Set the sticky bit, since there is almost surely another 1
> + * in the bit stream. Otherwise, we might round what looks
> + * like a tie to even when, almost surely, were we to look
> + * further in the bit stream, there would be a 1 breaking the
> + * tie.
> + */
> + significand |= 1;
> +
> + /*
> + * Finally, convert to double (rounding) and scale by
> + * 2^exponent.
> + */
> + return ldexp((double)significand, exponent);
> +}
>
> __dead void usage(void);
>
> @@ -86,7 +155,7 @@ main(int argc, char *argv[])
>
> /* Compute a random exit status between 0 and denom - 1. */
> if (random_exit)
> - return (arc4random_uniform(denom));
> + return random_real();
This isn't right. A double between 0 and 1 will result in a return
value of 0. For now you could do 'return random_real() * denom;'
>
> /*
> * Act as a filter, randomly choosing lines of the standard input
> @@ -101,7 +170,7 @@ main(int argc, char *argv[])
> * 0 (which has a 1 / denom chance of being true), we select the
> * line.
> */
> - selected = arc4random_uniform(denom) == 0;
> + selected = random_real() < (1 / denom);
I'd omit these parentheses.
> while ((ch = getchar()) != EOF) {
> if (selected)
> (void)putchar(ch);
> @@ -111,7 +180,7 @@ main(int argc, char *argv[])
> err(2, "stdout");
>
> /* Now see if the next line is to be printed. */
> - selected = arc4random_uniform(denom) == 0;
> + selected = random_real() < (1 / denom);
> }
> }
> if (ferror(stdin))