Re: rand48 replacement

2021-11-29 Thread Fabien COELHO
Pushed with that change and some others, notably: Thanks for the improvements and the push! -- Fabien.

Re: rand48 replacement

2021-11-28 Thread Tom Lane
Fabien COELHO writes: >> Good, but then you shouldn't write associated code as if that's still >> a problem, because you'll cause other people to think it's still a >> problem and write equally contorted code elsewhere. "v & 1" is a >> transparent way of producing a bool, while this code isn't.

Re: rand48 replacement

2021-11-28 Thread Fabien COELHO
Hello, They are not more nor less relevant than any other "random" constant. The state needs a default initialization. The point of using DK's is that it is somehow cannot be some specially crafted value which would have some special property only know to the purveyor of the constant and could

Re: rand48 replacement

2021-11-27 Thread Tom Lane
Fabien COELHO writes: >> How did Knuth get into this? This algorithm is certainly not his, >> so why are those constants at all relevant? > They are not more nor less relevant than any other "random" constant. The > state needs a default initialization. The point of using DK's is that it > is

Re: rand48 replacement

2021-11-27 Thread Fabien COELHO
Hello Tom, Thanks for the feedback. +/* use Donald Knuth's LCG constants for default state */ How did Knuth get into this? This algorithm is certainly not his, so why are those constants at all relevant? They are not more nor less relevant than any other "random" constant. The state

Re: rand48 replacement

2021-11-26 Thread Tom Lane
I wrote: > ... What we do need is a decision > about what to do on Windows. We could write it like > +#ifndef WIN32 > + srandom(pg_prng_i32(_global_prng_state)); > +#endif > but I have a different modest suggestion: add > #define srandom(seed) srand(seed) > in win32_port.h. As far as I can

Re: rand48 replacement

2021-11-26 Thread Tom Lane
Aleksander Alekseev writes: > It looks like the patch is in pretty good shape. I noticed that the > return value of pg_prng_strong_seed() is not checked in several > places, also there was a typo in pg_trgm.c. The corrected patch is > attached. Assuming the new version will not upset cfbot, I

Re: rand48 replacement

2021-11-22 Thread Aleksander Alekseev
Hi hackers, > > > I guess the declaration needs PGDLLIMPORT. > > Indeed, thanks! > > Attached v16 adds that. It looks like the patch is in pretty good shape. I noticed that the return value of pg_prng_strong_seed() is not checked in several places, also there was a typo in pg_trgm.c. The

Re: rand48 replacement

2021-09-30 Thread Fabien COELHO
I guess the declaration needs PGDLLIMPORT. Indeed, thanks! Attached v16 adds that. -- Fabien.diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index d19f73127c..b250ae912b 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@

Re: rand48 replacement

2021-09-30 Thread Thomas Munro
On Thu, Sep 30, 2021 at 9:23 PM Fabien COELHO wrote: > The missing symbol is really defined in common/pg_prng.c which AFAICT is > linked with postgres. > > If someone experienced with the windows compilation chain could give a > hint of what is needed, I'd appreciate it! I guess the declaration

Re: rand48 replacement

2021-09-30 Thread Fabien COELHO
Attached v15 also does call srandom if it is there, and fixes yet another remaining random call. I think that I have now removed all references to "random" from pg source. However, the test still fails on windows, because the linker does not find a global variable when compiling

Re: rand48 replacement

2021-09-25 Thread Fabien COELHO
Just FTR, I strongly object to your removal of process-startup srandom() calls. Ok. The point of the patch is to replace and unify the postgres underlying PRNG, so there was some logic behind this removal. FTR, this was triggered by your comment on Jul 1: [...] I see that you probably

Re: rand48 replacement

2021-09-25 Thread Fabien COELHO
Hello Tom, Just FTR, I strongly object to your removal of process-startup srandom() calls. Ok. The point of the patch is to replace and unify the postgres underlying PRNG, so there was some logic behind this removal. Those are not only setting the seed for our own use, but also ensuring

Re: rand48 replacement

2021-09-25 Thread Tom Lane
Fabien COELHO writes: > Not enough. Here is a v14 which might improve things further more again. > Sorry for this noise due to blind windows tests. Just FTR, I strongly object to your removal of process-startup srandom() calls. Those are not only setting the seed for our own use, but also

Re: rand48 replacement

2021-09-25 Thread Fabien COELHO
[1]: http://cfbot.cputube.org/ Indeed. I wish that these results would be available from the cf interface. Attached a v11 which might improve things. Not enough. Here is a v12 which might improve things further. Not enough. Here is a v13 which might improve things further more. Not

Re: rand48 replacement

2021-09-25 Thread Fabien COELHO
[1]: http://cfbot.cputube.org/ Indeed. I wish that these results would be available from the cf interface. Attached a v11 which might improve things. Not enough. Here is a v12 which might improve things further. Not enough. Here is a v13 which might improve things further more. --

Re: rand48 replacement

2021-09-25 Thread Fabien COELHO
Hello again, Just wanted to let you know that cfbot [1] doesn't seem to be happy with the patch. Apparently, some tests are falling. To be honest, I didn't invest too much time into investigating this. Hopefully, it's not a big deal. [1]: http://cfbot.cputube.org/ Indeed. I wish that these

Re: rand48 replacement

2021-09-25 Thread Fabien COELHO
Hello Aleksander, Attached a v10 which is some kind of compromise where the interface uses inclusive min and max bounds, so that all values can be reached. Just wanted to let you know that cfbot [1] doesn't seem to be happy with the patch. Apparently, some tests are falling. To be honest, I

Re: rand48 replacement

2021-09-24 Thread Aleksander Alekseev
Hi Fabien, > Attached a v10 which is some kind of compromise where the interface uses > inclusive min and max bounds, so that all values can be reached. Just wanted to let you know that cfbot [1] doesn't seem to be happy with the patch. Apparently, some tests are falling. To be honest, I didn't

Re: rand48 replacement

2021-07-08 Thread Fabien COELHO
Hello Yura, Given 99.99% cases will be in the likely case, branch predictor should eliminate decision cost. Hmmm. ISTM that a branch predictor should predict that unknown < small should probably be false, so a hint should be given that it is really true. Why? Branch predictor is history

Re: rand48 replacement

2021-07-08 Thread Fabien COELHO
Finally, I think it would be better to treat the upper bound of the range as inclusive. This bothered me as well, but the usual approach seems to use range as the number of values, so I was hesitant to depart from that. I'm still hesitant to go that way. Yeah, that bothered me too. For

Re: rand48 replacement

2021-07-08 Thread Dean Rasheed
On Thu, 8 Jul 2021 at 09:26, Fabien COELHO wrote: > > > Finally, I think it would be better to treat the upper bound of the > > range as inclusive. > > This bothered me as well, but the usual approach seems to use range as the > number of values, so I was hesitant to depart from that. I'm still >

Re: rand48 replacement

2021-07-08 Thread Fabien COELHO
Hello Dean, Whilst it has been interesting learning and discussing all these different techniques, I think it's probably best to stick with the bitmask method, rather than making the code too complex and difficult to follow. Yes. The bitmask method has the advantage of being very simple,

Re: rand48 replacement

2021-07-07 Thread Dean Rasheed
On Wed, 7 Jul 2021 at 04:00, Yura Sokolov wrote: > > Anyway, excuse me for heating this discussion cause of such > non-essential issue. > I'll try to control myself and don't proceed it further. > Whilst it has been interesting learning and discussing all these different techniques, I think it's

Re: rand48 replacement

2021-07-06 Thread Yura Sokolov
Fabien COELHO писал 2021-07-06 23:49: Hello Yura, However, I'm not enthousiastic at combining two methods depending on the range, the function looks complex enough without that, so I would suggest not to take this option. Also, the decision process adds to the average cost, which is

Re: rand48 replacement

2021-07-06 Thread Fabien COELHO
Hello Yura, However, I'm not enthousiastic at combining two methods depending on the range, the function looks complex enough without that, so I would suggest not to take this option. Also, the decision process adds to the average cost, which is undesirable. Given 99.99% cases will be in

Re: rand48 replacement

2021-07-06 Thread Yura Sokolov
Fabien COELHO писал 2021-07-06 09:13: Hello Yura, I believe most "range" values are small, much smaller than UINT32_MAX. In this case, according to [1] fastest method is Lemire's one (I'd take original version from [2]) [...] Yep. I share your point that the range is more often 32 bits.

Re: rand48 replacement

2021-07-06 Thread Fabien COELHO
Hello Yura, I believe most "range" values are small, much smaller than UINT32_MAX. In this case, according to [1] fastest method is Lemire's one (I'd take original version from [2]) [...] Yep. I share your point that the range is more often 32 bits. However, I'm not enthousiastic at

Re: rand48 replacement

2021-07-05 Thread Yura Sokolov
Fabien COELHO писал 2021-07-04 23:29: The important property of determinism is that if I set a seed, and then make an identical set of calls to the random API, the results will be identical every time, so that it's possible to write tests with predictable/repeatable results. Hmmm… I like my

Re: rand48 replacement

2021-07-04 Thread Fabien COELHO
The important property of determinism is that if I set a seed, and then make an identical set of calls to the random API, the results will be identical every time, so that it's possible to write tests with predictable/repeatable results. Hmmm… I like my stronger determinism definition more

Re: rand48 replacement

2021-07-04 Thread Dean Rasheed
On Sun, 4 Jul 2021 at 17:03, Fabien COELHO wrote: > > > As for determinism, the end result is still fully deterministic. > > The result is indeed deterministic of you call the function with the same > range. However, if you change the range value in one place then sometimes > the state can

Re: rand48 replacement

2021-07-04 Thread Fabien COELHO
Now suppose we want a random number in the range [0,6). This is what happens with your algorithm for each of the possible prng() return values: prng() returns 0 -- OK prng() returns 1 -- OK prng() returns 2 -- OK prng() returns 3 -- OK prng() returns 4 -- OK prng() returns 5 -- OK

Re: rand48 replacement

2021-07-04 Thread Dean Rasheed
On Sun, 4 Jul 2021 at 10:35, Fabien COELHO wrote: > > I did not understand why it is not correct. > Well, to make it easier to visualise, let's imagine our word size is just 3 bits instead of 64 bits, and that the basic prng() function generates numbers in the range [0,8). Similarly, imagine a

Re: rand48 replacement

2021-07-04 Thread Fabien COELHO
Hello Dean, - moves the stuff to common and fully removes random/srandom (Tom) - includes a range generation function based on the bitmask method (Dean) but iterates with splitmix so that the state always advances once (Me) At the risk of repeating myself: do *not* invent your own

Re: rand48 replacement

2021-07-04 Thread Dean Rasheed
On Sat, 3 Jul 2021 at 08:06, Fabien COELHO wrote: > > Here is a v4, which: > > - moves the stuff to common and fully removes random/srandom (Tom) > - includes a range generation function based on the bitmask method (Dean) > but iterates with splitmix so that the state always advances once

Re: rand48 replacement

2021-07-03 Thread Fabien COELHO
1. PostgreSQL source uses `uint64` and `uint32`, but not `uint64_t`/`uint32_t` Indeed you are right. Attached v6 does that as well. -- Fabien.diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c index 2c2f149fb0..146b524076 100644 --- a/contrib/file_fdw/file_fdw.c +++

Re: rand48 replacement

2021-07-03 Thread Fabien COELHO
Hello Yura, 1. PostgreSQL source uses `uint64` and `uint32`, but not `uint64_t`/`uint32_t` 2. I don't see why pg_prng_state could not be `typedef uint64 pg_prng_state[2];` It could, but I do not see that as desirable. From an API design point of view we want something clean and abstract,

Re: rand48 replacement

2021-07-03 Thread Tom Lane
Yura Sokolov writes: > 2. I don't see why pg_prng_state could not be `typedef uint64 > pg_prng_state[2];` Please no. That sort of typedef behaves so weirdly that it's a foot-gun. regards, tom lane

Re: rand48 replacement

2021-07-03 Thread Yura Sokolov
Fabien COELHO wrote 2021-07-03 11:45: And a v5 where an unused test file does also compile if we insist. About patch: 1. PostgreSQL source uses `uint64` and `uint32`, but not `uint64_t`/`uint32_t` 2. I don't see why pg_prng_state could not be `typedef uint64 pg_prng_state[2];` 3. Then

Re: rand48 replacement

2021-07-03 Thread Fabien COELHO
Here is a v4, which: - moves the stuff to common and fully removes random/srandom (Tom) - includes a range generation function based on the bitmask method (Dean) but iterates with splitmix so that the state always advances once (Me) And a v5 where an unused test file does also compile if

Re: rand48 replacement

2021-07-03 Thread Fabien COELHO
Hello Dean & Tom, Here is a v4, which: - moves the stuff to common and fully removes random/srandom (Tom) - includes a range generation function based on the bitmask method (Dean) but iterates with splitmix so that the state always advances once (Me) -- Fabien.diff --git

Re: rand48 replacement

2021-07-02 Thread Fabien COELHO
Hello Dean, It may be true that the bias is of the same magnitude as FP multiply, but it is not of the same nature. With FP multiply, the more-likely-to-be-chosen values are more-or-less evenly distributed across the range, whereas modulo concentrates them all at one end, making it more

Re: rand48 replacement

2021-07-02 Thread Dean Rasheed
On Thu, 1 Jul 2021 at 22:18, Fabien COELHO wrote: > > > However, this patch seems to be replacing that with a simple > > modulo operation, which is perhaps the worst possible way to do it. > > The modulo operation is biased for large ranges close to the limit, sure. > Also, the bias is somehow of

Re: rand48 replacement

2021-07-01 Thread Fabien COELHO
Hello Dean, I haven't looked at the patch in detail, but one thing I object to is the code to choose a random integer in an arbitrary range. Thanks for bringing up this interesting question! Currently, this is done in pgbench by getrand(), which has its problems. Yes. That is one of the

Re: rand48 replacement

2021-07-01 Thread Fabien COELHO
Hello Tom, Indeed. Here is a blind attempt at fixing the build, I'll check later to see whether it works. It would help me if the cfbot results were integrated into the cf app. Hmm, not there yet per cfbot, not sure why not. I'll investigate. Anyway, after taking a very quick look at

Re: rand48 replacement

2021-07-01 Thread Dean Rasheed
On Thu, 1 Jul 2021 at 19:41, Tom Lane wrote: > > Anyway, after taking a very quick look at the patch itself, I've > got just one main objection: I don't approve of putting this in > port.h or src/port/. I haven't looked at the patch in detail, but one thing I object to is the code to choose a

Re: rand48 replacement

2021-07-01 Thread Tom Lane
Fabien COELHO writes: >> Although this patch is marked RFC, the cfbot shows it doesn't >> even compile on Windows. I think you missed updating Mkvcbuild.pm. > Indeed. Here is a blind attempt at fixing the build, I'll check later to > see whether it works. It would help me if the cfbot results

Re: rand48 replacement

2021-07-01 Thread Fabien COELHO
Although this patch is marked RFC, the cfbot shows it doesn't even compile on Windows. I think you missed updating Mkvcbuild.pm. Indeed. Here is a blind attempt at fixing the build, I'll check later to see whether it works. It would help me if the cfbot results were integrated into the cf

Re: rand48 replacement

2021-07-01 Thread Tom Lane
Although this patch is marked RFC, the cfbot shows it doesn't even compile on Windows. I think you missed updating Mkvcbuild.pm. regards, tom lane

Re: rand48 replacement

2021-05-24 Thread Fabien COELHO
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:tested, passed Although the patch looks OK I would like to keep the

Re: rand48 replacement

2021-05-24 Thread Aleksander Alekseev
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:tested, passed Although the patch looks OK I would like to keep the status

Re: rand48 replacement

2021-05-24 Thread Aleksander Alekseev
Sorry for a duplicate entry on CF web application

Re: rand48 replacement

2021-05-24 Thread Fabien COELHO
Hello Aleksander, - better software engineering - similar speed (slightly slower) - better statistical quality - quite small state - soundness Personally, I think your patch is great. Thanks for having a look! Speaking of the speed I believe we should consider the performance

Re: rand48 replacement

2021-05-24 Thread Fabien COELHO
Hello Andrey, - NOT to invent a new design! Radical version of this argument would be to use de-facto standard and ubiquitous MT19937. Indeed, I started considering this one for this reason, obviously. Though, I suspect, it's not optimal solution to the date. "not optimal" does not

Re: rand48 replacement

2021-05-24 Thread Aleksander Alekseev
Hi Fabien, > To summarize: > - better software engineering > - similar speed (slightly slower) > - better statistical quality > - quite small state > - soundness Personally, I think your patch is great. Speaking of the speed I believe we should consider the performance of the entire

Re: rand48 replacement

2021-05-24 Thread Andrey Borodin
> 24 мая 2021 г., в 15:31, Fabien COELHO написал(а): > > > - NOT to invent a new design! Radical version of this argument would be to use de-facto standard and ubiquitous MT19937. Though, I suspect, it's not optimal solution to the date. Best regards, Andrey Borodin.

Re: rand48 replacement

2021-05-24 Thread Fabien COELHO
Hello Tomas, I have given a go at proposing a replacement for rand48. So what is the motivation for replacing rand48? Speed, quality of produced random numbers, features rand48 can't provide, or what? Speed can only be near rand48, see below. Quality (eg no trivial cycles, does not fail

Re: rand48 replacement

2021-05-24 Thread Tomas Vondra
Hi, On 5/24/21 12:31 PM, Fabien COELHO wrote: Hello pg-devs, I have given a go at proposing a replacement for rand48. So what is the motivation for replacing rand48? Speed, quality of produced random numbers, features rand48 can't provide, or what? POSIX 1988 (?) rand48 is a LCG PRNG

rand48 replacement

2021-05-24 Thread Fabien COELHO
Hello pg-devs, I have given a go at proposing a replacement for rand48. POSIX 1988 (?) rand48 is a LCG PRNG designed to generate 32 bits integers or floats based on a 48 bits state on 16 or 32 bits architectures. LCG cycles on the low bits, which can be quite annoying. Given that we run on