Hello,
On Tue, Dec 02, 2014 at 03:50:22PM -0700, Leonid A.Movsesyan wrote:
> Hello,
>
> I'm using HAProxy's rand function in configs to generate random values for
> HTTP headers. The problem is that multiple HAProxies on the same server that
> were launched at the same time (it happens sometimes in my system) are using
> the same seed value and generate the same sequence of random numbers which
> causes some bugs for me.
Well, if identical random numbers cause bugs, you have a deeper problem,
because by definition random numbers are uncorrelated, so two sequences can
very well be identical at times. Thus, if your workload cannot tolerate this,
maybe you don't want random numbers but something else (e.g. a hash of
something).

Also, I'm a bit doubtful, because the seed depends on the millisecond at
which the process starts and on its pid. So I find it hard to imagine that
you were unlucky enough to hit that situation across multiple machines.
> Here's a patch for src/sample.c that solved this problem. I also need the
> numbers to be as unique as possible, so the following patch works with
> /dev/urandom, but it might be a minor issue and it might be ignored for
> performance reasons.
I'm sorry but that cannot work:
  - it's out of the question to *open* a file on each call to the random
    fetch!
  - even if this were accepted, it would not work when haproxy runs inside
    a chroot (which is the recommended way of deploying it)

Also:
> @@ -1331,7 +1334,33 @@
> smp_fetch_rand(struct proxy *px, struct session *s, void *l7, unsigned int
> opt,
> const struct arg *args, struct sample *smp, const char *kw)
> {
> - smp->data.uint = random();
> +int fd;
> +struct timeval tv;
> +
> +gettimeofday(&tv, 0);
> +
> +fd = open("/dev/urandom", O_RDONLY);
> +if (fd == -1) {
> +fd = open("/dev/random", O_RDONLY | O_NONBLOCK);
> +}
> +
> +srandom((getpid() << 16) ^ getuid() ^ tv.tv_sec ^ tv.tv_usec);
> +
> +/* Use random() only if /dev/urandom or /dev/random don't present
> +* or can't be used.
> +*/
What you did above is totally wrong:
  - srandom() is called regardless of the fd test
  - gettimeofday() is called regardless of the fd test
  - getuid() has no chance of being different between identical LBs
  - you don't produce random numbers anymore, given that you call
    srandom() before each random(); what you output is a function of the
    time of day, which is the only variable here between multiple calls.
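
To make the last point concrete, here is a minimal C sketch. The helper
reseed_and_draw() is a hypothetical name used only for illustration, not
haproxy code. It shows that when srandom() is called right before random(),
the result depends only on the seed, so the same seed always yields the same
value:

```c
#include <stdlib.h>

/* Hypothetical helper mimicking "reseed, then draw" on every call.
 * Because the generator state is reset each time, the return value is
 * a pure function of the seed and carries no additional randomness.
 */
static long reseed_and_draw(unsigned int seed)
{
    srandom(seed);
    return random();
}
```

In the quoted patch the seed is built from the pid, the uid and the current
time, so in the random() fallback path two calls made by the same process
within the same microsecond would return the same number.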
> +if (fd >= 0) {
> +ssize_t x = read(fd, &smp->data.uint, sizeof (unsigned int));
> +
> +if (x <= 0) {
> +smp->data.uint = random();
> +}
> +
> +close(fd);
> +}
> +else {
> +smp->data.uint = random();
> +}
What you need to do instead is to improve the seed at startup.
Simply calling this once at startup:

    srandom(getpid() ^ tv.tv_sec ^ tv.tv_usec);

should be enough to ensure that multiple machines will run different
sequences.

Note that this is already equivalent to what is currently done by
srandom(nowms-getpid()), except that it increases the time precision.
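
As a rough sketch of seeding once at startup, something like the following
could be used. The names make_seed() and seed_random_once() are hypothetical
and only serve the illustration; they are not existing haproxy functions, and
where exactly the call belongs in the startup path is left open here:

```c
#include <stdlib.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: mix the pid with the seconds and microseconds of
 * the given timestamp. Kept separate so the mixing can be checked alone.
 */
static unsigned int make_seed(pid_t pid, const struct timeval *tv)
{
    return (unsigned int)pid ^ (unsigned int)tv->tv_sec ^
           (unsigned int)tv->tv_usec;
}

/* Hypothetical startup hook: call exactly once, before the first
 * random(). Later calls to random() then just advance the sequence.
 */
static void seed_random_once(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    srandom(make_seed(getpid(), &tv));
}
```

The random fetch itself then stays a plain random() call, with no file
access and no reseeding on the fast path.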
Regards,
Willy